Compositions and methods for detecting sessile serrated adenomas/polyps

ABSTRACT

The disclosure provides a method to detect sessile serrated adenomas/polyps (SSA/Ps) and to differentiate SSA/Ps from hyperplastic polyps (HPs). The method uses a molecular signature that is platform-independent and could be used with multiple platforms such as microarray, RNA-seq or real-time quantitative platforms.

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of International application number PCT/US2018/020517, filed Mar. 1, 2018, which claims the benefit of U.S. Provisional Application No. 62/465,588, filed Mar. 1, 2017, the disclosures of which are hereby incorporated by reference in their entirety.

GOVERNMENTAL RIGHTS

This invention was made with government support under CA176130 awarded by the National Institutes of Health. The government has certain rights in the invention.

FIELD OF THE INVENTION

The disclosure provides a method to detect sessile serrated adenomas/polyps (SSA/Ps) and to differentiate SSA/Ps from hyperplastic polyps (HPs). The method uses a molecular signature that is platform-independent and could be used with multiple platforms such as microarray, RNA-seq or real-time qPCR platforms.

BACKGROUND OF THE INVENTION

Colon cancer is the second leading cause of cancer-related deaths in the United States. Colonic neoplasms originate primarily from colon polyps, and develop via partially overlapping but mechanistically distinct pathways that have been designated as the adenomatous and serrated pathways. Accumulating evidence indicates that the majority of other colon adenocarcinomas, possibly 20-30%, arise from a subset of serrated polyps, designated sessile serrated adenomas/polyps (SSA/Ps), which were previously classified as hyperplastic polyps and thought to have little or no tumorigenic potential.

Sessile serrated adenomas/polyps (SSA/Ps) have been distinguished from hyperplastic polyps (HPs) on the basis of their endoscopic appearance (larger, flat and hypermucinous) and histologic characteristics (dilatated crypts, horizontal crypts, and boot-shaped deformities). However, because HPs may often have overlapping features, including serrated crypt architecture, borderline phenotypes can be difficult to assign. This has been highlighted by a number of studies documenting the frequent misclassification of SSA/Ps as HPs, resulting in inadequate follow-up. Conversely, misclassifying an HP as an SSA/P may result in unnecessary cancer screening in these patients. SSA/Ps account for 20-30% of colon cancers whereas HPs have little or no risk of progressing to colon cancer.

Thus, there is a need in the art for reliable diagnostic assays that could aid in the distinction between these lesions. Such an assay would be helpful for both diagnosis and surveillance stratification of patients.

SUMMARY OF THE INVENTION

In an aspect, the disclosure provides a method of detecting sessile serrated adenomas/polyps (SSA/Ps) in a subject. The method comprises: (a) determining the level of expression of the nucleic acids in the molecular signature in a biological sample obtained from the subject, wherein the molecular signature is selected from the group consisting of CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, FOXD1, PIK3R3, PRUNE2, TPD52L1, TRIB2, C4BPA, CPE, DPP10, GRAMD1B, GRIN2D, KLK7, MYCN, and TM4SF4; (b) comparing the level of expression of each nucleic acid in the molecular signature to a reference value; and (c) detecting SSA/Ps in the subject based on the level of expression of each nucleic acid in the molecular signature relative to the reference value.

In another aspect, the disclosure provides a method of differentiating sessile serrated adenomas/polyps (SSA/Ps) from hyperplastic polyps (HPs) in a subject. The method comprises: (a) determining the level of expression of the nucleic acids in the molecular signature in a biological sample obtained from the subject, wherein the molecular signature is selected from the group consisting of CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, FOXD1, PIK3R3, PRUNE2, TPD52L1, TRIB2, C4BPA, CPE, DPP10, GRAMD1B, GRIN2D, KLK7, MYCN, and TM4SF4; (b) comparing the level of expression of each nucleic acid in the molecular signature to a reference value; and (c) detecting SSA/Ps or HPs in the subject based on the level of expression of each nucleic acid in the molecular signature relative to the reference value.

In still another aspect, the disclosure provides a method of predicting the likelihood that a colorectal polyp in a subject will develop into colorectal cancer. The method comprises: (a) determining the level of expression of the nucleic acids in the molecular signature in a biological sample obtained from the subject, wherein the molecular signature is selected from the group consisting of CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, FOXD1, PIK3R3, PRUNE2, TPD52L1, TRIB2, C4BPA, CPE, DPP10, GRAMD1B, GRIN2D, KLK7, MYCN, and TM4SF4; (b) comparing the level of expression of each nucleic acid in the molecular signature to a reference value; and (c) detecting SSA/Ps in the subject based on the level of expression of each nucleic acid in the molecular signature relative to the reference value, wherein the detection of SSA/Ps in the subject indicates an increased likelihood of developing colorectal cancer.

In still yet another aspect, the disclosure provides a method of determining treatment of a subject diagnosed with serrated polyps or suspected of having serrated polyps. The method comprises: (a) determining the level of expression of the nucleic acids in the molecular signature in a biological sample obtained from the subject, wherein the molecular signature is selected from the group consisting of CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, FOXD1, PIK3R3, PRUNE2, TPD52L1, TRIB2, C4BPA, CPE, DPP10, GRAMD1B, GRIN2D, KLK7, MYCN, and TM4SF4; (b) comparing the level of expression of each nucleic acid in the molecular signature to a reference value; (c) detecting SSA/Ps in the subject based on the level of expression of each nucleic acid in the molecular signature relative to the reference value; and (d) treating the subject more aggressively if SSA/Ps are detected.

Additionally, the disclosure provides a kit to differentiate SSA/Ps and HPs in a subject. The kit comprises detection agents that can detect the expression products of a molecular signature in a biological sample obtained from the subject, wherein the molecular signature is selected from the group consisting of CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, FOXD1, PIK3R3, PRUNE2, TPD52L1, TRIB2, C4BPA, CPE, DPP10, GRAMD1B, GRIN2D, KLK7, MYCN, and TM4SF4.

BRIEF DESCRIPTION OF THE FIGURES

The application file contains at least one drawing executed in color. Copies of this patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.

FIG. 1 depicts a Venn diagram summarizing the differentially expressed (DE) genes in three comparisons.

FIG. 2A, FIG. 2B, FIG. 2C, and FIG. 2D depict principal component analysis (PCA) scatter plots. (FIG. 2A) SSA/P and HP samples are not well-separated when all the expressed genes are considered; (FIG. 2B) control right (CR) and control left (CL) samples are well-separated when all the expressed genes are considered; (FIG. 2C) SSA/P and HP samples are well-separated when only the genes differentially expressed between SSA/Ps and HPs, with the exclusion of genes DE between CR and CL, are considered (139 genes); (FIG. 2D) CR and CL samples are well-separated when only the 152 genes in (FIG. 2C) are considered.

FIG. 3 depicts a heatmap of RNA-seq expression data. Hierarchical clustering of CR (green), HP (yellow) and SSA/Ps (blue) biopsies (columns) and differentially expressed genes (rows). Only genes that were expressed at the same level in HP and CR samples but significantly up- or down-regulated in SSA/Ps are shown. Down-regulated and up-regulated genes in SSA/P are indicated in blue and orange colors, respectively. The log₂(SSA/P/HP) is shown next to gene names on the right side.

FIG. 4 depicts MST2 of the ‘Golgi stack’ gene set from the C5 collection of MSigDB. This gene set was detected by GSNCA (P<0.05) in both comparisons: HPs versus SSA/Ps (FIG. 4A) and CRs versus SSA/Ps (FIG. 4B).

FIG. 5A, FIG. 5B, FIG. 5C, and FIG. 5D depict examples illustrating the new feature selection step. (FIG. 5A) The fold change in both platforms was larger than the within-phenotype variability and the correlation coefficient between platforms (ρ_(true)) was high; (FIG. 5B) when phenotypic labels in FIG. 5A were randomly resampled, the fold change in both platforms became negligible as compared to the within-phenotype variability and the correlation coefficient between platforms (ρ_(random)) became low. (FIG. 5C) The fold change in both platforms was smaller than the within-phenotype variability and the correlation coefficient between platforms (ρ_(true)) was low; (FIG. 5D) when phenotypic labels in FIG. 5C were randomly resampled, the correlation coefficient (ρ_(random)) was low.

FIG. 6 depicts the probability assignment for a class call. The probability of an assigned SSA/P (HP) class is the cumulative distribution function CDF(SM) (1-CDF(SM)) of the empirical distribution of SM after standardization. The empirical approach can also be substituted by the normal approximation of SM. Since both approaches have limitations, the Cantelli lower bound (CLB) is used as a conservative probability assignment for the SM score.

FIG. 7 depicts MST2 of the MEIOSIS gene set of the C5 collection obtained from MSigDB. This gene set is detected by GSNCA (P<0.05) in both comparisons: HP versus SSA/P (FIG. 7A) and CR versus SSA/P (FIG. 7B).

FIG. 8 depicts MST2 of the REGULATION OF DNA REPLICATION gene set of the C5 collection obtained from MSigDB. This gene set is detected by GSNCA (P<0.05) in both comparisons: HP versus SSA/P (FIG. 8A) and CR versus SSA/P (FIG. 8B).

FIG. 9 depicts MST2 of the PROTEIN TARGETING TO MEMBRANE gene set of the C5 collection obtained from MSigDB. This gene set is detected by GSNCA (P<0.05) in both comparisons: HP versus SSA/P (FIG. 9A) and CR versus SSA/P (FIG. 9B).

FIG. 10 depicts MST2 of the MEIOTIC RECOMBINATION gene set from the C5 collection obtained from MSigDB. This gene set is detected by GSNCA (P<0.05) in both comparisons: HP versus SSA/P (FIG. 10A) and CR versus SSA/P (FIG. 10B).

FIG. 11 depicts MST2 of the KINASE ACTIVATOR ACTIVITY gene set from the C5 collection obtained from MSigDB. This gene set is detected by GSNCA (P<0.05) in both comparisons: HP versus SSA/P (FIG. 11A) and CR versus SSA/P (FIG. 11B).

FIG. 12 depicts MST2 of the HORMONE ACTIVITY gene set from the C5 collection obtained from MSigDB. This gene set is detected by GSNCA (P<0.05) in both comparisons: HP versus SSA/P (FIG. 12A) and CR versus SSA/P (FIG. 12B).

FIG. 13A, FIG. 13B, FIG. 13C, and FIG. 13D depict histograms of the Pearson correlation coefficient between two platforms obtained in 10000 iterations. Only 117 genes expressed in all three platforms (RNA-seq, Illumina, and Affymetrix) and found to be differentially expressed between SSA/Ps and both HPs and CRs are considered. (FIG. 13A) correlation between the RNA-seq and the Illumina platforms when phenotypic labels are preserved; (FIG. 13B) correlation between the RNA-seq and the Illumina platforms when phenotypic labels are randomly resampled; (FIG. 13C) correlation between the RNA-seq and the Affymetrix platforms when phenotypic labels are preserved; (FIG. 13D) correlation between the RNA-seq and the Affymetrix platforms when phenotypic labels are randomly resampled.

FIG. 14A, FIG. 14B, and FIG. 14C depict histograms showing that the MAD-normalized log-scale gene expression data in all three platforms approximately follow a Laplace-like distribution centered around zero; (FIG. 14A) RNA-seq dataset (17243 genes and 31 samples); (FIG. 14B) Illumina dataset (17123 genes and 12 samples); (FIG. 14C) Affymetrix dataset (19090 genes and 17 samples).

FIG. 15A, FIG. 15B, and FIG. 15C depict histograms of the summary metric (SM) obtained by summing the MAD-normalized expressions of a random signature of 15 genes in all three platforms. Six HP and six SSA/P samples were randomly selected from each platform in each iteration and a total of 10000 iterations were used to generate the histogram of SM. The SM approximately follows a normal-like distribution that is centered around zero and has a higher kurtosis than the standard normal distribution; (FIG. 15A) RNA-seq data set; (FIG. 15B) Illumina data set; (FIG. 15C) Affymetrix data set.

FIG. 16 depicts a principal component analysis (PCA) scatter plot showing the first and second components for expression levels normalized by first subtracting sample medians and then subtracting gene-wise medians from each individual gene.

FIG. 17 depicts a barplot of the average raw expression levels of 13 genes obtained by qPCR from 45 FFPE tissue samples. For each gene, samples are grouped according to their phenotype (HP or SSA/P). Error bars extend to ±one standard deviation. Raw expression levels are relative to the housekeeping genes, hence higher expression levels here correspond to lower values.

FIG. 18A and FIG. 18B depict boxplots for the expression levels of 13 genes obtained by qPCR from 45 FFPE tissue samples. (FIG. 18A) raw expression levels centered around zero; (FIG. 18B) expression levels normalized by first subtracting sample medians and then subtracting gene-wise medians from each individual gene.
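
By way of illustration only, two computations named in the figure descriptions above can be sketched in Python: the two-step median normalization (FIG. 16 and FIG. 18B) and the Cantelli lower bound (FIG. 6). The function names, the list-of-rows data layout, and the reading of the bound as z²/(1+z²) for a standardized score z (which follows from the one-sided Chebyshev inequality) are assumptions made for this sketch and are not taken from the disclosure.

```python
from statistics import median


def median_center(expr):
    """Two-step centering of a genes x samples matrix of log-scale values.

    expr is a list of gene rows; each row holds one value per sample.
    (Layout and name are assumptions made for this sketch.)
    """
    n_samples = len(expr[0])
    # Step 1: subtract each sample's (column's) median.
    sample_medians = [median(row[j] for row in expr) for j in range(n_samples)]
    step1 = [[row[j] - sample_medians[j] for j in range(n_samples)] for row in expr]
    # Step 2: subtract each gene's (row's) median.
    centered = []
    for row in step1:
        gene_median = median(row)
        centered.append([v - gene_median for v in row])
    return centered


def cantelli_lower_bound(z):
    """Conservative probability z^2 / (1 + z^2) for a standardized score z."""
    return z * z / (1.0 + z * z)
```

Under these assumptions, a standardized SM of 1 would be assigned a conservative probability of 0.5 and a standardized SM of 3 a probability of 0.9, regardless of the shape of the underlying distribution.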

DETAILED DESCRIPTION OF THE INVENTION

Provided herein are methods to detect sessile serrated adenomas/polyps (SSA/Ps) and to distinguish SSA/Ps from hyperplastic polyps (HPs). Prior to the disclosure, there has been difficulty in distinguishing SSA/Ps from HPs. Current histopathological methods have about 60-70% accuracy in distinguishing SSA/Ps from HPs. However, the methodology disclosed herein has an impressive 90% accuracy at correctly distinguishing SSA/Ps from HPs. Notably, the molecular signature disclosed herein was able to achieve this accuracy on preserved FFPE tissues. Further, the molecular signature was developed such that it is platform-independent and could be used with multiple platforms such as microarray, RNA-seq or real-time qPCR platforms to effectively distinguish SSA/Ps from HPs. As SSA/Ps have a higher risk of progressing to cancer, it is important that SSA/Ps are accurately diagnosed such that the subject is treated properly. By accurately detecting SSA/Ps, the subject may be treated more aggressively or monitored more frequently. Thus, the method disclosed herein may be used to determine the risk of progression to colorectal cancer and also decrease the risk of progression to colorectal cancer by allowing for earlier interventions.

The methods are described in more detail below.

I. Molecular Signature

In an aspect, the disclosure provides a molecular signature for differentiating sessile serrated adenomas/polyps (SSA/Ps) and hyperplastic polyps (HPs) in a subject. As used herein, the term “molecular signature” refers to a set of nucleic acids that are differentially expressed in a subject. For example, serrated polyps may be classified into hyperplastic polyps (HPs), sessile serrated adenomas/polyps (SSA/Ps), and traditional serrated adenomas (TSAs) and the expression levels of the nucleic acids in the molecular signature may be used to differentiate SSA/Ps and HPs. Accordingly, the molecular signature may also be used to predict prognosis, predict development of colorectal cancer, develop a treatment strategy, develop a follow-up/monitoring strategy, determine response to treatment, monitor progression of disease, etc.

In one embodiment, the molecular signature comprises at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, or at least 17 nucleic acids selected from the group consisting of C4BPA, CHGA, CLDN1, CPE, DPP10, GRAMD1B, GRIN2D, KIZ, KLK7, MEGF6, MYCN, NTRK2, PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, and TM4SF4. Specifically, the molecular signature comprises 18 nucleic acids selected from the group consisting of C4BPA, CHGA, CLDN1, CPE, DPP10, GRAMD1B, GRIN2D, KIZ, KLK7, MEGF6, MYCN, NTRK2, PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, and TM4SF4.

In another embodiment, the molecular signature comprises at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, or at least 15 nucleic acids selected from the group consisting of CLDN1, FOXD1, KIZ, MEGF6, NTRK2, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1, and TRIB2. Specifically, the molecular signature comprises 16 nucleic acids selected from the group consisting of CLDN1, FOXD1, KIZ, MEGF6, NTRK2, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1, and TRIB2.

In still another embodiment, the molecular signature comprises at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, or at least 12 nucleic acids selected from the group consisting of CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, and TACSTD2. Specifically, the molecular signature comprises 13 nucleic acids selected from the group consisting of CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, and TACSTD2.

Alternatively, a molecular signature of the disclosure may comprise 3 to 10, 10 to 20, 20 to 30, 30 to 50, 50 to 100, 100 to 200, 200 to 300, 300 to 400 and more than 400 nucleic acids. In one embodiment, a molecular signature of the disclosure may comprise at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, or all 26 nucleic acids from Table A. In addition, other nucleic acids not herein described may be combined with any of the presently disclosed nucleic acids to aid in the differentiation of sessile serrated adenomas/polyps (SSA/Ps) and hyperplastic polyps (HPs). A skilled artisan would be able to determine the various sequences of the nucleic acids listed in Table A. Nucleic acids have transcript variants due to alternative splicing. A skilled artisan would be able to determine various transcript variants from the accession numbers provided.

TABLE A Nucleic acids for molecular signature.

Homo sapiens Gene Name | Description | Accession Number
C4BPA | complement component 4 binding protein alpha | NM_000715.3
CHFR | checkpoint with forkhead and ring finger domains, E3 ubiquitin protein ligase | NM_001161344.1
CHGA | chromogranin A | NM_001275.3
CLDN1 | claudin 1 | NM_021101.4
CPE | carboxypeptidase E | NM_001873.3
DPP10 | dipeptidyl peptidase like 10 | NM_020868.4
FOXD1 | forkhead box D1 | NM_004472.2
GRAMD1B | GRAM domain containing 1B | NM_001286563.1
GRIN2D | glutamate ionotropic receptor NMDA type subunit 2D | NM_000836.2
KIZ | kizuna centrosomal protein | NM_018474.4
KLK7 | kallikrein related peptidase 7 | NM_005046.3
MEGF6 | multiple EGF like domains 6 | NM_001409.3
MYCN | v-myc avian myelocytomatosis viral oncogene neuroblastoma derived homolog | NM_001293228.1
NTRK2 | neurotrophic tyrosine kinase, receptor, type 2 | NM_006180.4
PIK3R3 | phosphoinositide-3-kinase regulatory subunit 3 | NM_003629.3
PLA2G16 | phospholipase A2 group XVI | NM_007069.3
PRUNE2 | prune homolog 2 | NM_015225.2
PTAFR | platelet activating factor receptor | NM_001164721.1
SBSPON | somatomedin B and thrombospondin type 1 domain containing | NM_153225.3
SEMG1 | semenogelin I | NM_003007.4
SLC7A9 | solute carrier family 7 member 9 | NM_014270.4
SPIRE1 | spire type actin nucleation factor 1 | NM_001128626.1
TACSTD2 | tumor-associated calcium signal transducer 2 | NM_002353.2
TM4SF4 | transmembrane 4 L six family member 4 | NM_004617.3
TPD52L1 | tumor protein D52-like 1 | NM_003287.3
TRIB2 | tribbles pseudokinase 2 | NM_021643.3

The molecular signature may further comprise one or more nucleic acids used as a normalization control. A normalization control compensates for systemic technical differences between experiments, to see more clearly the systemic biological differences between samples. A normalization control is a nucleic acid whose expression is not expected to be different across samples. Generally, these nucleic acids may be known as ‘housekeeping’ nucleic acids which are required for basic cell processes. Non-limiting examples of housekeeping nucleic acids commonly used as normalization controls include GAPDH, ACTB, B2M, TUBA, G6PD, LDHA, HPRT, ALDOA, PFKP, PGK1, PGAM1, VIM and UBC.
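
As a non-limiting sketch of how a normalization control might be applied to qPCR data, the following Python function subtracts the mean cycle-threshold (Ct) value of the housekeeping nucleic acids from each signature nucleic acid (the conventional delta-Ct calculation). The choice of GAPDH and ACTB as controls, the function name, and the dictionary layout are assumptions made for illustration; the disclosure does not prescribe this particular calculation.

```python
from statistics import mean

# Hypothetical choice of controls, taken from the non-limiting list above.
HOUSEKEEPING = ("GAPDH", "ACTB")


def delta_ct(ct_values, housekeeping=HOUSEKEEPING):
    """Return Ct of each non-control gene minus the mean Ct of the controls.

    ct_values maps gene name -> raw Ct for one sample. A lower delta-Ct
    corresponds to higher expression relative to the controls.
    """
    reference = mean(ct_values[g] for g in housekeeping)
    return {g: ct - reference for g, ct in ct_values.items() if g not in housekeeping}
```

For example, with control Ct values of 20 and 22 and a CLDN1 Ct of 25, the normalized value for CLDN1 would be 4.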

II. Methods

In an aspect, the disclosure provides a method to classify a subject based on the level of expression of the nucleic acids in a molecular signature of the disclosure. The method generally comprises: (a) determining the level of expression of the nucleic acids in a molecular signature of the disclosure in a biological sample obtained from the subject; (b) comparing the level of expression of each nucleic acid in the molecular signature to a reference value; and (c) classifying the subject based on the level of expression of each nucleic acid in the molecular signature relative to the reference value. In an embodiment, the molecular signature comprises 18 nucleic acids selected from the group consisting of C4BPA, CHGA, CLDN1, CPE, DPP10, GRAMD1B, GRIN2D, KIZ, KLK7, MEGF6, MYCN, NTRK2, PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, and TM4SF4. In another embodiment, the molecular signature comprises 16 nucleic acids selected from the group consisting of CLDN1, FOXD1, KIZ, MEGF6, NTRK2, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1, and TRIB2. In still another embodiment, the molecular signature comprises 13 nucleic acids selected from the group consisting of CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, and TACSTD2.

In another aspect, the disclosure provides a method of detecting sessile serrated adenomas/polyps (SSA/Ps) in a subject. The method comprises: (a) determining the level of expression of the nucleic acids in a molecular signature of the disclosure in a biological sample obtained from the subject; (b) comparing the level of expression of each nucleic acid in the molecular signature to a reference value; and (c) detecting SSA/Ps in the subject based on the level of expression of each nucleic acid in the molecular signature relative to the reference value. In an embodiment, the molecular signature comprises 18 nucleic acids selected from the group consisting of C4BPA, CHGA, CLDN1, CPE, DPP10, GRAMD1B, GRIN2D, KIZ, KLK7, MEGF6, MYCN, NTRK2, PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, and TM4SF4. In another embodiment, the molecular signature comprises 16 nucleic acids selected from the group consisting of CLDN1, FOXD1, KIZ, MEGF6, NTRK2, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1, and TRIB2. In still another embodiment, the molecular signature comprises 13 nucleic acids selected from the group consisting of CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, and TACSTD2. Specifically, step (c) comprises detecting SSA/Ps in the subject when CHFR, CHGA, and NTRK2 are decreased relative to the reference value and when CLDN1, KIZ, MEGF6, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, and TACSTD2 are increased relative to the reference value, wherein the reference value is the level of expression of each nucleic acid in the molecular signature in a non-diseased or HP sample.
Additionally, step (c) comprises detecting SSA/Ps in the subject when NTRK2 is decreased relative to the reference value and when CLDN1, FOXD1, KIZ, MEGF6, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1, and TRIB2 are increased relative to the reference value, wherein the reference value is the level of expression of each nucleic acid in the molecular signature in a non-diseased or HP sample. Further, step (c) comprises detecting SSA/Ps in the subject when CHGA, CPE, DPP10, and NTRK2 are decreased relative to the reference value and when C4BPA, CLDN1, GRAMD1B, GRIN2D, KIZ, KLK7, MEGF6, MYCN, PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, and TM4SF4 are increased relative to the reference value, wherein the reference value is the level of expression of each nucleic acid in the molecular signature in a non-diseased or HP sample.
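
For illustration, step (c) for the 13-nucleic-acid signature described above may be sketched in Python as a direction-of-change check against per-gene reference values. The function name, the dictionary inputs, and the strict all-genes-must-agree rule are assumptions made for this sketch; it presumes values on a scale where a larger number means higher expression, and a practical implementation would likely apply a threshold or summary score rather than a literal comparison.

```python
# Genes expected to be decreased / increased in SSA/Ps relative to a
# non-diseased or HP reference (13-nucleic-acid signature).
DOWN_IN_SSAP = ("CHFR", "CHGA", "NTRK2")
UP_IN_SSAP = (
    "CLDN1", "KIZ", "MEGF6", "PLA2G16", "PTAFR",
    "SBSPON", "SEMG1", "SLC7A9", "SPIRE1", "TACSTD2",
)


def matches_ssap_pattern(sample, reference):
    """Return True when every gene moves in the direction expected for SSA/Ps.

    sample and reference map gene name -> expression level (hypothetical
    inputs; larger values are assumed to mean higher expression).
    """
    decreased = all(sample[g] < reference[g] for g in DOWN_IN_SSAP)
    increased = all(sample[g] > reference[g] for g in UP_IN_SSAP)
    return decreased and increased
```

A sample in which any one of the thirteen nucleic acids fails to move in the stated direction would not be called an SSA/P under this literal reading.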

In still another aspect, the disclosure provides a method of differentiating sessile serrated adenomas/polyps (SSA/Ps) from hyperplastic polyps (HPs) in a subject. The method comprises: (a) determining the level of expression of the nucleic acids in a molecular signature of the disclosure in a biological sample obtained from the subject; (b) comparing the level of expression of each nucleic acid in the molecular signature to a reference value; and (c) detecting SSA/Ps or HPs in the subject based on the level of expression of each nucleic acid in the molecular signature relative to the reference value. In an embodiment, the molecular signature comprises 18 nucleic acids selected from the group consisting of C4BPA, CHGA, CLDN1, CPE, DPP10, GRAMD1B, GRIN2D, KIZ, KLK7, MEGF6, MYCN, NTRK2, PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, and TM4SF4. In another embodiment, the molecular signature comprises 16 nucleic acids selected from the group consisting of CLDN1, FOXD1, KIZ, MEGF6, NTRK2, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1, and TRIB2. In still another embodiment, the molecular signature comprises 13 nucleic acids selected from the group consisting of CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, and TACSTD2. Specifically, step (c) comprises detecting SSA/Ps in the subject when CHFR, CHGA, and NTRK2 are decreased relative to the reference value and when CLDN1, KIZ, MEGF6, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, and TACSTD2 are increased relative to the reference value, wherein the reference value is the level of expression of each nucleic acid in the molecular signature in a non-diseased or HP sample.
Additionally, step (c) comprises detecting SSA/Ps in the subject when NTRK2 is decreased relative to the reference value and when CLDN1, FOXD1, KIZ, MEGF6, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1, and TRIB2 are increased relative to the reference value, wherein the reference value is the level of expression of each nucleic acid in the molecular signature in a non-diseased or HP sample. Further, step (c) comprises detecting SSA/Ps in the subject when CHGA, CPE, DPP10, and NTRK2 are decreased relative to the reference value and when C4BPA, CLDN1, GRAMD1B, GRIN2D, KIZ, KLK7, MEGF6, MYCN, PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, and TM4SF4 are increased relative to the reference value, wherein the reference value is the level of expression of each nucleic acid in the molecular signature in a non-diseased or HP sample.

In still yet another aspect, the disclosure provides a method of predicting the likelihood that a colorectal polyp in a subject will develop into colorectal cancer. The method comprises: (a) determining the level of expression of the nucleic acids in a molecular signature of the disclosure in a biological sample obtained from the subject; (b) comparing the level of expression of each nucleic acid in the molecular signature to a reference value; and (c) detecting SSA/Ps in the subject based on the level of expression of each nucleic acid in the molecular signature relative to the reference value, wherein the detection of SSA/Ps in the subject indicates an increased likelihood of developing colorectal cancer. Treatment decisions may then be made based on the detection of SSA/Ps. In an embodiment, the molecular signature comprises 18 nucleic acids selected from the group consisting of C4BPA, CHGA, CLDN1, CPE, DPP10, GRAMD1B, GRIN2D, KIZ, KLK7, MEGF6, MYCN, NTRK2, PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, and TM4SF4. In another embodiment, the molecular signature comprises 16 nucleic acids selected from the group consisting of CLDN1, FOXD1, KIZ, MEGF6, NTRK2, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1, and TRIB2. In still another embodiment, the molecular signature comprises 13 nucleic acids selected from the group consisting of CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, and TACSTD2. Specifically, step (c) comprises detecting SSA/Ps in the subject when CHFR, CHGA, and NTRK2 are decreased relative to the reference value and when CLDN1, KIZ, MEGF6, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, and TACSTD2 are increased relative to the reference value, wherein the reference value is the level of expression of each nucleic acid in the molecular signature in a non-diseased or HP sample.
Additionally, step (c) comprises detecting SSA/Ps in the subject when NTRK2 is decreased relative to the reference value and when CLDN1, FOXD1, KIZ, MEGF6, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1, and TRIB2 are increased relative to the reference value, wherein the reference value is the level of expression of each nucleic acid in the molecular signature in a non-diseased or HP sample. Further, step (c) comprises detecting SSA/Ps in the subject when CHGA, CPE, DPP10, and NTRK2 are decreased relative to the reference value and when C4BPA, CLDN1, GRAMD1B, GRIN2D, KIZ, KLK7, MEGF6, MYCN, PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, and TM4SF4 are increased relative to the reference value, wherein the reference value is the level of expression of each nucleic acid in the molecular signature in a non-diseased or HP sample.

In other aspects, the disclosure provides a method of determiningtreatment of a subject diagnosed with serrated polyps or suspected ofhaving serrated polyps. The method generally comprises: (a) determiningthe level of expression of the nucleic acids in a molecular signature ofthe disclosure in a biological sample obtained from the subject; (b)comparing the level of expression of each nucleic acid in the molecularsignature to a reference value; (c) detecting SSA/Ps in the subjectbased on the level of expression of each nucleic acid in the molecularsignature relative to the reference value; and (d) treating the subjectmore aggressively if SSA/Ps are detected. Serrated polyps may beclassified into hyperplastic polyps (HPs), sessile serratedadenomas/polyps (SSA/Ps), and traditional serrated adenomas (TSAs).SSA/Ps have the strongest association with an increased risk for coloncancer. Accordingly, if SSA/Ps are detected, the subject may be moreaggressively treated relative to treatment for HPs. Non-limitingexamples of treatment for SSA/Ps include polypectomy, endoscopicresection, and surgical resection, all followed with surveillance.Additionally or alternatively, if SSA/Ps are detected, the subject maybe subjected to an increased frequency of surveillance, such ascolonoscopy. For example, the subject may receive a colonoscopy aboutevery 1 to about every 6 years. Accordingly, if SSA/Ps are detected, thesubject may receive a colonoscopy about every 1 year, about every 2years, about every 3 years, about every 4 years, about every 5 years, orabout every 6 years. For example, a subject having a polyp classified asan SSA/P according to the methods detailed herein and the polyp havingdiameter of at least about 10 mm would have a subsequent colonoscopy inabout 2 years to about 4 years, or about 3 years. 
For example, a subjecthaving a polyp classified as an SSA/P according to the methods detailedherein and the polyp having of diameter of less than about 5 mm wouldhave a subsequent colonoscopy in about 4 years to about 6 years, orabout 5 years. A subject having a polyp classified as an SSA/P accordingto the methods detailed herein and being of diameter of about 5 mm toabout 10 mm would have a subsequent colonoscopy in about 2 years toabout 6 years, about 3 to about 5 years, or about 4 years. More frequentcolonoscopies may be suggested for subjects having multiple SSA/Ppolyps. By more accurately diagnosing a polyp as a SSA/P instead of as ahyperplastic polyp, a subject may be more frequently screened bycolonoscopy, leading to a reduced incidence of colon cancer and deathsdue to colon cancer. In an embodiment, the molecular signature comprises18 nucleic acids selected from the group consisting of C4BPA, CHGA,CLDN1, CPE, DPP10, GRAMD1B, GRIN2D, KIZ, KLK7, MEGF6, MYCN, NTRK2,PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, and TM4SF4. In anotherembodiment, the molecular signature comprises 16 nucleic acids selectedfrom the group consisting of CLDN1, FOXD1, KIZ, MEGF6, NTRK2, PIK3R3,PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1,and TRIB2. In still another embodiment, the molecular signaturecomprises 13 nucleic acids selected from the group consisting of CHFR,CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9,SPIRE1, and TACSTD2. Specifically, step (c) comprises detecting SSA/Psin the subject when CHFR, CHGA, and NTRK2 are decreased relative to thereference value and when CLDN1, KIZ, MEGF6, PLA2G16, PTAFR, SBSPON,SEMG1, SLC7A9, SPIRE1, and TACSTD2 are increased relative to thereference value, wherein the reference value is the level of expressionof each nucleic acid in the molecular signature in a non-diseased or HPsample. 
Additionally, step (c) comprises detecting SSA/Ps in the subjectwhen NTRK2 is decreased relative to the reference value and when CLDN1,FOXD1, KIZ, MEGF6, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1,SLC7A9, SPIRE1, TACSTD2, TPD52L1, and TRIB2 are increased relative tothe reference value, wherein the reference value is the level ofexpression of each nucleic acid in the molecular signature in anon-diseased or HP sample. Further, step (c) comprises detecting SSA/Psin the subject when CHGA, CPE, DPP10, and NTRK2 are decreased relativeto the reference value and when C4BPA, CLDN1, GRAMD1B, GRIN2D, KIZ,KLK7, MEGF6, MYCN, PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, and TM4SF4are increased relative to the reference value, wherein the referencevalue is the level of expression of each nucleic acid in the molecularsignature in a non-diseased or HP sample.
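
By way of illustration only, the direction-of-change test recited for the 13-nucleic-acid signature can be sketched in Python as follows. The gene lists are taken from the text above; the function name, the dictionary layout, the numeric expression values, and the rule that every nucleic acid must move in the stated direction are assumptions made for this sketch and are not a decision rule stated by the disclosure.

```python
# Direction of change recited for SSA/Ps in the 13-nucleic-acid signature,
# relative to a non-diseased or HP reference value.
DECREASED_IN_SSAP = {"CHFR", "CHGA", "NTRK2"}
INCREASED_IN_SSAP = {
    "CLDN1", "KIZ", "MEGF6", "PLA2G16", "PTAFR",
    "SBSPON", "SEMG1", "SLC7A9", "SPIRE1", "TACSTD2",
}


def matches_ssap_pattern(sample, reference):
    """Return True when every nucleic acid moves in the recited direction.

    `sample` and `reference` map gene symbol -> expression level.
    Requiring all 13 to agree is an assumption of this sketch.
    """
    down_ok = all(sample[g] < reference[g] for g in DECREASED_IN_SSAP)
    up_ok = all(sample[g] > reference[g] for g in INCREASED_IN_SSAP)
    return down_ok and up_ok


# Hypothetical, made-up expression levels used only to exercise the function.
reference = {g: 10.0 for g in DECREASED_IN_SSAP | INCREASED_IN_SSAP}
ssap_like = {g: 4.0 for g in DECREASED_IN_SSAP}
ssap_like.update({g: 25.0 for g in INCREASED_IN_SSAP})

print(matches_ssap_pattern(ssap_like, reference))
print(matches_ssap_pattern(reference, reference))
```

In practice a classifier trained on expression data would likely replace the all-or-nothing rule shown here; the sketch only makes the recited directions concrete.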

In other aspects, the disclosure provides a method for monitoringserrated polyps in a subject. In such an embodiment, a method ofdetecting sessile serrated adenomas/polyps (SSA/Ps) in a subject isperformed at one point in time. Then, at a later time, the method ofdetecting sessile serrated adenomas/polyps (SSA/Ps) in the subject maybe performed to determine the change in serrated polyps over time. Forexample, the method of detecting sessile serrated adenomas/polyps(SSA/Ps) may be performed on the same subject days, weeks, months, oryears following the initial use of the method to detect sessile serratedadenomas/polyps (SSA/Ps). Accordingly, the method of detecting SSA/Psmay be used to follow a subject over time to determine when the risk ofprogressing to more severe disease is high thereby requiring treatment.Additionally, the method of detecting SSA/Ps may be used to measure therate of disease progression. For example, an increased level of CLDN1,KIZ, MEGF6, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, and TACSTD2and decreased level of CHFR, CHGA, and NTRK2 may indicate diseaseprogression. Early assessment of the risk of colorectal cancer in thesubject may reduce the development and/or progression of symptomsassociated with colorectal cancer by enabling improved interventions orenabling earlier interventions. The term “risk” as used herein refers tothe probability that an event will occur over a specific time period,for example, as in the development of colorectal cancer (CRC) and canmean a subject's “absolute” risk or “relative” risk. Absolute risk canbe measured with reference to either actual observation,post-measurement for the relevant time cohort, or with reference toindex values developed from statistically valid historical cohorts thathave been followed for the relevant time period. 
Relative risk refers to the ratio of absolute risks of a subject compared either to the absolute risks of low risk cohorts or an average population risk, which can vary depending on how clinical risk factors are assessed. Odds, the proportion of positive events to negative events for a given test result, are also commonly used (odds are given by the formula p/(1−p), where p is the probability of the event and (1−p) is the probability of no event).
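
As a minimal numerical sketch of the quantities just defined, the following Python functions compute odds from a probability using p/(1−p) and relative risk as a ratio of two absolute risks. The function names and the example probabilities are hypothetical and are chosen only for illustration.

```python
def odds(p):
    """Odds of an event with probability p, using the formula p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def relative_risk(subject_absolute_risk, comparison_absolute_risk):
    """Ratio of a subject's absolute risk to a comparison absolute risk.

    The comparison may be a low-risk cohort or an average population risk.
    """
    if comparison_absolute_risk <= 0.0:
        raise ValueError("comparison risk must be positive")
    return subject_absolute_risk / comparison_absolute_risk


# Hypothetical probabilities, for illustration only.
print(odds(0.2))                  # odds for an event with probability 0.2
print(relative_risk(0.06, 0.02))  # subject risk relative to a comparison cohort
```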

Additionally, a method for monitoring serrated polyps in a subject maybe used to determine the response to treatment. As used herein, subjectswho respond to treatment are said to have benefited from treatment. Forexample, a method of detecting SSA/Ps may be performed on the biologicalsample of the subject prior to initiation of treatment. Then, at a latertime, a method of detecting SSA/Ps may be used to determine the responseto treatment over time. For example, a method of detecting SSA/Ps may beperformed on the biological sample of the same subject days, weeks,months, or years following initiation of treatment. Accordingly, amethod of detecting SSA/Ps may be used to follow a subject receivingtreatment to determine if the subject is responding to treatment. If thelevel of expression of the nucleic acids in a molecular signature of thedisclosure remains the same, then the subject may not be responding totreatment. If the level of expression of the nucleic acids in amolecular signature of the disclosure changes, then the subject may beresponding to treatment. These steps may be repeated to determine theresponse to therapy over time.

In any of the foregoing embodiments, the subject may or may not be diagnosed with serrated polyps or SSA/Ps. In certain embodiments, the subject may not be diagnosed with serrated polyps or SSA/Ps but is suspected of having serrated polyps or SSA/Ps based on symptoms. Non-limiting examples of symptoms of serrated polyps or SSA/Ps that may lead to a diagnosis include bleeding and iron deficiency anemia. In other embodiments, the subject may not be diagnosed with serrated polyps or SSA/Ps but is at risk of having serrated polyps or SSA/Ps. Non-limiting examples of risk factors for serrated polyps or SSA/Ps include smoking, diabetes, obesity, age, sex, diet, and family history. In other embodiments, the subject has no symptoms and/or no risk factors for serrated polyps or SSA/Ps. Methods of diagnosing serrated polyps or SSA/Ps are known in the art. Non-limiting examples of methods of diagnosing serrated polyps or SSA/Ps include histological pathology.

Suitable subjects include, but are not limited to, a human, a livestockanimal, a companion animal, a lab animal, and a zoological animal. Inone embodiment, the subject may be a rodent, e.g. a mouse, a rat, aguinea pig, etc. In another embodiment, the subject may be a livestockanimal. Non-limiting examples of suitable livestock animals may includepigs, cows, horses, goats, sheep, llamas, and alpacas. In yet anotherembodiment, the subject may be a companion animal. Non-limiting examplesof companion animals may include pets such as dogs, cats, rabbits, andbirds. In yet another embodiment, the subject may be a zoologicalanimal. As used herein, a “zoological animal” refers to an animal thatmay be found in a zoo. Such animals may include non-human primates,large cats, wolves, and bears. In an embodiment, the animal is alaboratory animal. Non-limiting examples of a laboratory animal mayinclude rodents, canines, felines, and non-human primates. In certainembodiments, the animal is a rodent. In a preferred embodiment, thesubject is human.

(a) Biological Sample

As used herein, the term “biological sample” refers to a sample obtained from a subject. Any biological sample which may be assayed for nucleic acid expression products may be used. Numerous types of biological samples are known in the art. Suitable biological samples may include, but are not limited to, tissue samples or bodily fluids. In some embodiments, the biological sample is a tissue sample such as a tissue biopsy from the gastrointestinal tract. The biopsy may be taken during a colonoscopy, prior to surgical resection, during surgical resection or following surgical resection. The biopsied tissue may be fixed, embedded in paraffin or plastic, and sectioned, or the biopsied tissue may be frozen and cryosectioned. In an embodiment, the biological sample is a formalin-fixed paraffin-embedded (FFPE) tissue sample. Alternatively, the biopsied tissue may be processed into individual cells or an explant, or processed into a homogenate, a cell extract, a membranous fraction, or a protein extract. In a specific embodiment, the biopsied tissue is from a colorectal polyp. In other embodiments, the sample may be a bodily fluid. Non-limiting examples of suitable bodily fluids include blood, plasma, serum, or feces. The fluid may be used “as is”, the cellular components may be isolated from the fluid, or a protein fraction may be isolated from the fluid using standard techniques.

As will be appreciated by a skilled artisan, the method of collecting abiological sample can and will vary depending upon the nature of thebiological sample and the type of analysis to be performed. Any of avariety of methods generally known in the art may be utilized to collecta biological sample. Generally speaking, the method preferably maintainsthe integrity of the sample such that the nucleic acids of a molecularsignature of the disclosure can be accurately detected and the level ofexpression measured according to the disclosure.

In some embodiments, a single sample is obtained from a subject todetect the molecular signature in the sample. Alternatively, themolecular signature may be detected in samples obtained over time from asubject. As such, more than one sample may be collected from a subjectover time. For instance, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,16, or more samples may be collected from a subject over time. In someembodiments, 2, 3, 4, 5, or 6 samples are collected from a subject overtime. In other embodiments, 6, 7, 8, 9, or 10 samples are collected froma subject over time. In yet other embodiments, 10, 11, 12, 13, or 14samples are collected from a subject over time. In other embodiments,14, 15, 16, or more samples are collected from a subject over time.

When more than one sample is collected from a subject over time, samplesmay be collected every 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or moredays. In some embodiments, samples are collected every 1, 2, 3, 4, or 5days. In other embodiments, samples are collected every 5, 6, 7, 8, or 9days. In yet other embodiments, samples are collected every 9, 10, 11,12, or more days. In still other embodiments, samples are collected amonth apart, 3 months apart, 6 months apart, 1 year apart, 2 yearsapart, 5 years apart, 10 years apart, or more.

(b) Determining the Level of Nucleic Acid Expression

Once a sample is obtained, it is processed in vitro to detect and measure the level of expression of the nucleic acids in a molecular signature of the disclosure. Methods for assessing the level of nucleic acid expression are well known in the art, and all suitable methods for detecting and measuring the level of expression of nucleic acids known to one of skill in the art are contemplated within the scope of the invention. The term “amount of nucleic acid expression” or “level of nucleic acid expression” or “expression level” as used herein refers to a measurable level of expression of the nucleic acids, such as, without limitation, the level of messenger RNA transcript expressed or a specific exon or other portion of a transcript, the level of proteins or portions thereof expressed from the nucleic acids, the number or presence of DNA polymorphisms of the nucleic acids, the enzymatic or other activities of the proteins encoded by the nucleic acids, and the level of a specific metabolite. The term “nucleic acid” includes DNA and RNA and can be either double stranded or single stranded. In a specific embodiment, determining the level of expression of a nucleic acid of the molecular signature comprises, in part, measuring the level of RNA expression. The term “RNA” includes mRNA transcripts, and/or specific spliced or other alternative variants of mRNA, including anti-sense products. The term “RNA product of the nucleic acid” as used herein refers to RNA transcripts transcribed from the nucleic acids and/or specific spliced or alternative variants. Non-limiting examples of suitable methods to assess a level of nucleic acid expression may include arrays, such as microarrays, RNA-seq, PCR, such as RT-PCR (including quantitative RT-PCR), nuclease protection assays and Northern blot analyses. In an embodiment, the method to assess the level of nucleic acid expression is microarray, RNA-seq or real-time qPCR.

In one embodiment, the level of nucleic acid expression may be determined by using an array, such as a microarray. Methods of using a nucleic acid microarray are well and widely known in the art. For example, a plurality of nucleic acid probes that are complementary or hybridizable to an expression product of each nucleic acid of the molecular signature are used on the array. Accordingly, 3 to 10, 10 to 20, 20 to 30, 30 to 50, 50 to 100, 100 to 200, 200 to 300, 300 to 400, or more than 400 nucleic acids may be used on the array. The term “hybridize” or “hybridizable” refers to the sequence specific non-covalent binding interaction with a complementary nucleic acid. In a preferred embodiment, the hybridization is under high stringency conditions. Appropriate stringency conditions which promote hybridization are known to those skilled in the art, or can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. The term “probe” as used herein refers to a nucleic acid sequence that will hybridize to a nucleic acid target sequence. In one example, the probe hybridizes to an RNA product of the nucleic acid or a nucleic acid sequence complementary thereto. The length of the probe depends on the hybridization conditions and the sequences of the probe and nucleic acid target sequence. In one embodiment, the probe is at least 8, 10, 15, 20, 25, 50, 75, 100, 150, 200, 250, 400, 500, or more nucleotides in length.

In another embodiment, the level of nucleic acid expression may be determined using PCR. Methods of PCR are well and widely known in the art, and may include quantitative PCR, semi-quantitative PCR, multiplex PCR, or any combination thereof. Specifically, the level of nucleic acid expression may be determined using quantitative RT-PCR. Methods of performing quantitative RT-PCR are common in the art. In such an embodiment, the primers used for quantitative RT-PCR may comprise a forward and reverse primer for a target gene. The term “primer” as used herein refers to a nucleic acid sequence, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of synthesis when placed under conditions in which synthesis of a primer extension product, which is complementary to a nucleic acid strand, is induced (e.g. in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer must be sufficiently long to prime the synthesis of the desired extension product in the presence of the inducing agent. The exact length of the primer will depend upon factors including temperature, sequences of the primer and the methods used. A primer typically contains 15-25 or more nucleotides, although it can contain fewer or more. The factors involved in determining the appropriate length of primer are readily known to one of ordinary skill in the art.

The level of nucleic acid expression may be measured by measuring anentire mRNA transcript for a nucleic acid sequence, or measuring aportion of the mRNA transcript for a nucleic acid sequence. Forinstance, if a nucleic acid array is utilized to measure the amount ofmRNA expression, the array may comprise a probe for a portion of themRNA of the nucleic acid sequence of interest, or the array may comprisea probe for the full mRNA of the nucleic acid sequence of interest.Similarly, in a PCR reaction, the primers may be designed to amplify theentire cDNA sequence of the nucleic acid sequence of interest, or aportion of the cDNA sequence. One of skill in the art will recognizethat there is more than one set of primers that may be used to amplifyeither the entire cDNA or a portion of the cDNA for a nucleic acidsequence of interest. Methods of designing primers are known in the art.Methods of extracting RNA from a biological sample are known in the art.

The level of expression may or may not be normalized to the level of a control nucleic acid. Such normalization allows comparisons between assays that are performed on different occasions.

(c) Comparing the Level of Nucleic Acid Expression and Detecting SSA/Ps

The level of expression of each nucleic acid of the molecular signature may be compared to a reference expression level for each nucleic acid of the molecular signature. The subject expression levels of the nucleic acids in the molecular signature in a biological sample are compared to the corresponding reference expression levels of the nucleic acids of the molecular signature to detect SSA/Ps. Accordingly, a reference expression level may comprise 3 to 10, 10 to 20, 20 to 30, 30 to 50, 50 to 100, 100 to 200, 200 to 300, 300 to 400, or more than 400 expression levels based on the number of nucleic acids in the molecular signature. Any suitable reference value known in the art may be used. For example, a suitable reference value may be the level of the molecular signature in a biological sample obtained from a subject or group of subjects of the same species that have no signs or symptoms of disease (i.e. serrated polyps). In another example, a suitable reference value may be the level of the molecular signature in a biological sample obtained from a subject or group of subjects of the same species that have not been diagnosed with disease (i.e. serrated polyps). In still another example, a suitable reference value may be the level of the molecular signature in a biological sample obtained from a subject or group of subjects of the same species that have been diagnosed with SSA/Ps. In yet still another example, a suitable reference value may be the level of the molecular signature in a biological sample obtained from a subject or group of subjects of the same species that have been diagnosed with HPs. In a different example, a suitable reference value may be the background signal of the assay as determined by methods known in the art. In another different example, a suitable reference value may be the level of the molecular signature in a non-diseased or HP sample stored on a computer readable medium.
In still another different example, a suitable reference value may be the level of the molecular signature in an SSA/P sample stored on a computer readable medium. Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or other magnetic medium, a CD-ROM, CDRW, DVD, or other optical medium, punch cards, paper tape, optical mark sheets, or other physical medium with patterns of holes or other optically recognizable indicia, a RAM, a PROM, an EPROM, a FLASH-EPROM, or other memory chip or cartridge, a carrier wave, or other medium from which a computer can read.

In other examples, a suitable reference value may be the level of the molecular signature in a reference sample obtained from the same subject. The reference sample may or may not have been obtained from the subject when serrated polyps or SSA/Ps were not suspected. A skilled artisan will appreciate that it is not always possible or desirable to obtain a reference sample from a subject when the subject is otherwise healthy. For example, in an acute setting, a reference sample may be the first sample obtained from the subject at presentation. In another example, when monitoring effectiveness of a therapy, a reference sample may be a sample obtained from a subject before therapy began. In a specific embodiment, a reference value may be the level of expression of each nucleic acid of the molecular signature in a non-diseased portion of the subject. Such a reference expression level may be used to create a control value that is used in testing diseased samples from the subject.

The expression level of each nucleic acid of the molecular signature is compared to the reference expression level of each nucleic acid of the molecular signature to determine if the nucleic acids of the molecular signature in the test sample are differentially expressed relative to the reference expression level of the corresponding nucleic acid. The term “differentially expressed” or “differential expression” as used herein refers to a difference in the level of expression of the nucleic acids that can be assayed by measuring the level of expression of the products of the nucleic acids, such as a difference in the level of a messenger RNA transcript or a portion thereof, or of proteins expressed from the nucleic acids.

The term “difference in the level of expression” refers to an increase or decrease in the measurable expression levels of a given nucleic acid, for example as measured by the amount of messenger RNA transcript and/or the amount of protein in a biological sample as compared with the measurable expression level of a given nucleic acid in a reference sample (i.e. non-diseased or HP sample). In one embodiment, the differential expression can be compared using the ratio of the level of expression of a given nucleic acid or nucleic acids as compared with the expression level of the given nucleic acid or nucleic acids of a reference sample, wherein the ratio is not equal to 1.0. For example, an RNA or protein is differentially expressed if the ratio of the level of expression of a first sample as compared with a second sample is greater than or less than 1.0. For example, the ratio may be greater than 1, 1.2, 1.5, 1.7, 2, 3, 4, 5, 10, 15, 20 or more, or the ratio may be less than 1, 0.8, 0.6, 0.4, 0.2, 0.1, 0.05, 0.001, or less. In another embodiment, the differential expression is measured using a p-value. For instance, when using a p-value, a nucleic acid is identified as being differentially expressed between a first sample and a second sample when the p-value is less than 0.1, preferably less than 0.05, more preferably less than 0.01, even more preferably less than 0.005, and most preferably less than 0.0001.
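
The ratio-based comparison described above can be sketched in Python as follows. This is an illustration under stated assumptions only: the function names are hypothetical, and the cutoffs of 2 and 0.4 are simply two of the example values listed in the preceding paragraph, not thresholds prescribed by the disclosure.

```python
def expression_ratio(sample_level, reference_level):
    """Ratio of the sample expression level to the reference expression level."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return sample_level / reference_level


def call_by_ratio(ratio, up_cutoff=2.0, down_cutoff=0.4):
    """Label a ratio as increased, decreased, or unchanged.

    The default cutoffs are illustrative picks from the example values in
    the text; any other listed value could be substituted.
    """
    if ratio > up_cutoff:
        return "increased"
    if ratio < down_cutoff:
        return "decreased"
    return "unchanged"


# Hypothetical expression levels (arbitrary units), for illustration only.
print(call_by_ratio(expression_ratio(30.0, 10.0)))
print(call_by_ratio(expression_ratio(3.0, 10.0)))
print(call_by_ratio(expression_ratio(10.0, 10.0)))
```

A p-value from a suitable statistical test could be used in place of, or together with, the ratio cutoffs, as the paragraph above also contemplates.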

Depending on the sample used for reference expression levels, thedifference in the level of expression may or may not be statisticallysignificant. For example, if the sample used for reference expressionlevels is from a subject or subjects diagnosed with SSA/Ps, then whenthe difference in the level of expression is not significantlydifferent, the subject has SSA/Ps. However, when the difference in thelevel of expression is significantly different, the subject has HPs.Alternatively, if the sample used for reference expression levels isfrom a subject or subjects diagnosed with no disease or HP, then whenthe difference in the level of expression is not significantlydifferent, the subject does not have SSA/Ps. However, when thedifference in the level of expression is significantly different, thesubject has SSA/Ps.

(d) Treatment

The determination of SSA/Ps may be used to select treatment forsubjects. As explained herein, a molecular signature disclosed hereincan classify a subject as having HPs or SSA/Ps and into groups thatmight benefit from more aggressive therapy or determine the appropriatetreatment for the subject. In an embodiment, a subject classified ashaving SSA/Ps may be treated. A skilled artisan would be able todetermine standard treatment for SSA/Ps. Accordingly, the methodsdisclosed herein may be used to select treatment for serrated polypsubjects. In an embodiment, the subject is treated based on the level ofexpression of the nucleic acids in a molecular signature of thedisclosure measured in the sample. This classification may be used toidentify groups that are in need of treatment or not or in need of moreaggressive treatment. The term “treatment” or “therapy” as used hereinmeans any treatment suitable for the treatment of SSA/Ps. Treatment mayconsist of standard treatments for SSA/Ps. Non-limiting examples ofstandard treatment for SSA/Ps include increased surveillance,polypectomy, endoscopic resection, and surgical resection. Additionally,the treatment decision may be made based on evidence of progression fromSSA/Ps to cancer.

III. Kit

In an aspect, there is provided a kit to differentiate SSA/Ps and HPs ina subject, comprising detection agents that can detect the expressionproducts of a molecular signature of the disclosure, and instructionsfor use. The kit may further comprise one or more nucleic acids used asa normalization control. The kit may comprise detection agents that candetect the expression products of 3 to 10, 10 to 20, 20 to 30, 30 to 50,50 to 100, 100 to 200, 200 to 300, 300 to 400, and more than 400 nucleicacids described herein.

In another aspect, there is provided a kit to select a therapy for asubject with serrated polyps, comprising detection agents that candetect the expression products of a molecular signature of thedisclosure, and instructions for use. The kit may further comprise oneor more nucleic acids used as a normalization control. The kit maycomprise detection agents that can detect the expression products of 3to 10, 10 to 20, 20 to 30, 30 to 50, 50 to 100, 100 to 200, 200 to 300,300 to 400, and more than 400 nucleic acids described herein.

A person skilled in the art will appreciate that a number of detection agents can be used to determine the expression of the nucleic acids. For example, to detect RNA products of the biomarkers, probes, primers, complementary nucleotide sequences or nucleotide sequences that hybridize to the RNA products can be used.

Accordingly, in one embodiment, the detection agents are probes that hybridize to the nucleic acids in the molecular signature. A person skilled in the art will appreciate that the detection agents can be labeled. The label is preferably capable of producing, either directly or indirectly, a detectable signal. For example, the label may be radio-opaque or a radioisotope, such as ³H, ¹⁴C, ³²P, ³⁵S, ¹²³I, ¹²⁵I, ¹³¹I; a fluorescent (fluorophore) or chemiluminescent (chromophore) compound, such as fluorescein isothiocyanate, rhodamine or luciferin; an enzyme, such as alkaline phosphatase, beta-galactosidase or horseradish peroxidase; an imaging agent; or a metal ion.

The kit can also include a control or reference standard and/or instructions for use thereof. In addition, the kit can include ancillary agents such as vessels for storing or transporting the detection agents and/or buffers or stabilizers.

In some embodiments, the kit is a nucleic acid array, a multiplex RNA, a chip-based array, or the like.

In certain embodiments, the kit is a nucleic acid array. Such an arraymay be used to determine the expression level of the nucleic acids in abiological sample. An array may be comprised of a substrate havingdisposed thereon nucleic acid sequences capable of hybridizing to thenucleic acid sequences of a molecular signature of the disclosure. Forinstance, the array may comprise nucleic acid sequences capable ofhybridizing to 18 nucleic acids selected from the group consisting ofC4BPA, CHGA, CLDN1, CPE, DPP10, GRAMD1B, GRIN2D, KIZ, KLK7, MEGF6, MYCN,NTRK2, PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, and TM4SF4. In anotherembodiment, the array may comprise nucleic acid sequences capable ofhybridizing to 16 nucleic acids selected from the group consisting ofCLDN1, FOXD1, KIZ, MEGF6, NTRK2, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON,SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1, and TRIB2. In still anotherembodiment, the array may comprise nucleic acid sequences capable ofhybridizing to 13 nucleic acids selected from the group consisting ofCHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1,SLC7A9, SPIRE1, and TACSTD2.

In certain embodiments, the kit is a chip-based array. Such an array may be used to determine the expression level of the proteins in a biological sample. The proteins may be the expression products of the nucleic acid sequences disclosed herein.

A person skilled in the art will appreciate that a number of detectionagents can be used to determine the expression level of thetranscription products of the nucleic acid sequences disclosed herein.

Several substrates suitable for the construction of arrays are known in the art. The substrate may be a material that may be modified to contain discrete individual sites appropriate for the attachment or association of the nucleic acid and is amenable to at least one detection method. Alternatively, the substrate may be a material that may be modified for the bulk attachment or association of the nucleic acid and is amenable to at least one detection method. Non-limiting examples of substrate materials include glass, modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), nylon or nitrocellulose, polysaccharides, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses and plastics. In an embodiment, the substrates may allow optical detection without appreciably fluorescing.

A substrate may be planar, a substrate may be a well, e.g. a 1536-, 384-, or 96-well plate, or alternatively, a substrate may be a bead. Additionally, the substrate may be the inner surface of a tube for flow-through sample analysis to minimize sample volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed cell foams made of particular plastics. Other suitable substrates are known in the art.

The nucleic acid or biomolecules may be attached to the substrate in awide variety of ways, as will be appreciated by those in the art. Thenucleic acid may either be synthesized first, with subsequent attachmentto the substrate, or may be directly synthesized on the substrate. Thesubstrate and the nucleic acid may both be derivatized with chemicalfunctional groups for subsequent attachment of the two. For example, thesubstrate may be derivatized with a chemical functional group including,but not limited to, amino groups, carboxyl groups, oxo groups or thiolgroups. Using these functional groups, the nucleic acid may be attachedusing functional groups on the biomolecule either directly or indirectlyusing linkers.

The nucleic acid may also be attached to the substrate non-covalently. For example, a biotinylated nucleic acid can be prepared, which may bind to surfaces covalently coated with streptavidin, resulting in attachment. Alternatively, a nucleic acid or nucleic acids may be synthesized on the surface using techniques such as photopolymerization and photolithography. Additional methods of attaching biomolecules to arrays and methods of synthesizing biomolecules on substrates are well known in the art, e.g. VLSIPS technology from Affymetrix (e.g., see U.S. Pat. No. 6,566,495, and Rockett and Dix, Xenobiotica 30(2):155-177, each of which is hereby incorporated by reference in its entirety).

In one embodiment, the nucleic acid or nucleic acids attached to thesubstrate are located at a spatially defined address of the array.Arrays may comprise from about 1 to about several hundred thousandaddresses. A nucleic acid may be represented more than once on a givenarray. In other words, more than one address of an array may becomprised of the same nucleic acid. In some embodiments, two, three, ormore than three addresses of the array may be comprised of the samenucleic acid. In certain embodiments, the array may comprise controlnucleic acids and/or control addresses. The controls may be internalcontrols, positive controls, negative controls, or background controls.

Furthermore, the nucleic acids used for the array may be labeled. One skilled in the art understands that the type of label selected depends in part on how the array is being used. Suitable labels may include fluorescent labels, chromogenic labels, chemi-luminescent labels, FRET labels, etc. Such labels are well known in the art.

As various changes could be made in the above compounds, products and methods without departing from the scope of the invention, it is intended that all matter contained in the above description and in the examples given below shall be interpreted as illustrative and not in a limiting sense.

EXAMPLES

The following examples are included to demonstrate various embodiments of the present disclosure. It should be appreciated by those of skill in the art that the techniques disclosed in the examples that follow represent techniques discovered by the inventors to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.

Introduction.

Screening programs have resulted in a significant reduction of colorectal cancer (CRC) related deaths. Key to the improvement of clinical outcomes is the appropriate follow-up using colonoscopy and removal of premalignant polyps. However, different types of colonic polyps have different malignant potentials, and recommendations for removal and follow-up vary depending on their type. The most common polyps include the conventional adenomas and serrated polyps, and until approximately 1996 the hyperplastic polyp was the only recognized type of serrated polyp. The term sessile serrated adenoma/polyp was introduced to define serrated lesions which were generally considered to be preneoplastic, usually lack cytological dysplasia and have been reported in 5% of average-risk patients undergoing screening colonoscopy. Currently, serrated polyps are divided into three main categories: typical hyperplastic polyps (HPs), sessile serrated adenomas/polyps (SSA/Ps) and traditional serrated adenomas (relatively rare). However, SSA/Ps and HPs share significant histological similarities, as serrated crypt architecture is the principal microscopic feature in both polyps. Dilated or boot-shaped crypt bases are diagnostic features of SSA/Ps. In general, SSA/Ps are larger than HPs and are more commonly located in the proximal (right) colon. However, given the significant histologic overlap between the two polyp types, biopsy specimens are frequently equivocal in cases lacking the diagnostic hallmarks of SSA/Ps. In addition, several studies have pointed out significant observer-to-observer variability, even among expert pathologists. Because SSA/Ps have the potential to progress into colon cancer, reliable biomarkers that aid in this differential diagnosis are needed. It is estimated that SSA/Ps account for 15-30% of colon cancers by progression through the serrated neoplasia pathway. However, this pathway remains relatively uncharacterized as compared to the adenoma-carcinoma pathway.
Genetic and epigenetic mechanisms operating in the serrated pathway can include BRAF mutations, KRAS mutations, CpG island methylator high (CIMP-H) and microsatellite instability high (MSI-H) phenotypes which often predict a poor clinical outcome. However, the serrated neoplasia pathway remains to be defined by a characteristic set of genetic and epigenetic lesions.

Since the advent of high-throughput gene expression technologies (microarrays, RNA sequencing), molecular signatures that accurately diagnose or predict disease outcome based on expression of sets of genes have been developed. In many cases gene expression signatures can be associated with biological mechanisms, subtypes of cancer that look histologically similar, tumor stages, as well as the ability to metastasize, relapse or respond to specific therapies. Expression-based classifiers were also developed to identify patients with a poor prognosis for stage II colon cancers. Recently, a subgroup of colon cancers with a very poor prognosis was identified, and this subgroup has several up-regulated pathways in common with sessile serrated adenomas. However, there is no molecular classifier that differentiates between SSA/Ps and HPs.

Several recent studies used transcriptome analyses to gain insights into the biology of SSA/Ps. For example, in a gene array study SSA/Ps were compared to tubular adenomas (TAs) and control samples. Among 67 differentially expressed (DE) genes, the two most up-regulated genes (Cathepsin E and Trefoil Factor 1) were verified in QRT-PCR and immunohistochemistry experiments that showed that these genes were overexpressed in SSA/Ps. In another gene array study, 162 DE genes were identified in SSA/Ps as compared to microvesicular hyperplastic polyps (MVHP, HP subtype). Validation by QRT-PCR and immunohistochemistry identified annexin A10 as a potential diagnostic marker of SSA/Ps. Another study used RNA sequencing (RNA-seq) to analyze the SSA/P transcriptomes and identified 1,294 genes differentially expressed in SSA/Ps as compared to HPs. This analysis provided evidence that molecular pathways involved in colonic mucosal integrity and cell adhesion were overrepresented in SSA/Ps.

The goals of this study were two-fold. The first goal was to gain insights into the biological processes underlying the differences between SSA/Ps and HPs. Data from HPs and SSA/Ps matched with control samples were analyzed. Importantly, the right and left colon have a different embryological origin and it was shown that more than 1,000 genes are differentially expressed in adult right versus left colon. SSA/Ps occur predominantly in the right colon and HPs occur predominantly in the left colon. Consequently, some genes that are DE between SSA/Ps and HPs are likely to be due to their different anatomical location (right versus left). Therefore, to find genes and pathways that are DE specifically between SSA/Ps and HPs, it is first necessary to exclude genes that are DE between the right and left colon. As such, in addition to SSA/Ps and HPs, control samples obtained from the right colon (CR) and left colon (CL) were also included in the study. The analysis of differentially expressed genes and pathways revealed several differentially expressed and differentially co-expressed pathways between SSA/Ps and HP, CR samples. The pathways found here are generally considered hallmarks of cancer: they were associated with the ability to escape apoptotic signals, the inflammatory state of premalignant lesions and uncontrolled proliferation.

The second goal was to develop an expression-based classifier that reliably differentiates between HPs and SSA/Ps and is platform-independent (it works for RNA-seq as well as for microarrays). For that purpose, independent microarray data sets were collected: an Illumina gene array data set (six HPs and six SSA/Ps) and subsets of samples from two Affymetrix data sets (eleven HPs from GSE10714 and six SSA/Ps from GSE45270). Typically, the most ambiguous step in classifier development is the step of feature selection because of the ‘large p small n’ problem of omics data. Omics data have at most only hundreds of samples (n) and thousands of features (p), and using all features will lead to model over-fitting and poor generalizability. Feature selection techniques differ in the way they combine feature selection with the construction of the classification model and usually are classified into three categories: filter, wrapper, and embedded algorithms. Filter algorithms preselect features before a classifier is applied, based, for example, on the results of significance testing. Wrapper algorithms combine the search for optimal features with the model selection and evaluate features by training and testing a classification model. For example, the Shrunken Centroid Classifier (SCC) first finds a centroid for each class and then shrinks each class centroid toward the overall centroid, retaining only the features that still distinguish the classes after shrinkage. Presented here is a new way to combine filter and wrapper algorithms that best fitted the goal, i.e. building a platform-independent classifier. First, the feature space was reduced by selecting only those features (genes) that were concordantly expressed over all three platforms. Second, SCC (using all genes left after filtering) was applied to RNA-seq data to further reduce the feature space and select features with optimal classification performance. The classifier, developed based on RNA-seq data, identified SSA/P and HP subtypes in independent microarray data sets with low classification errors.
The molecular signature that correctly classifies SSA/Ps and HPs consists of thirteen genes and is a first platform-independent signature that is applicable as a diagnostic tool for distinguishing SSA/Ps from HPs. The molecular signature achieved an impressive correct classification rate (90%) when expression levels obtained by real-time quantitative polymerase chain reaction (qPCR) from 45 independent formalin-fixed paraffin-embedded (FFPE) SSA/P and HP samples were used for validation. These results demonstrate the clinical value of the molecular signature.

Expression Analysis.

Filtering Steps.

Genes were called DE if two conditions were met: |log₂FC|>0.5 and adjusted p-values P_(adj)<0.05 (see Methods for more detail). The intersections of the three comparisons: (1) Control Right (CR) versus Control Left (CL) samples (CR_CL), (2) HP versus SSA/P samples (HP_SSA/P) and (3) CR versus SSA/P samples (CR_SSA/P) are shown in FIG. 1. There were 1049 genes DE between CR and CL samples, and among these genes 157 were also DE between HPs and SSA/Ps and 276 were DE between CR and SSA/P samples. There were 121 genes in the intersection of all three comparisons. With the aim of identifying only genes that reliably differentiate between HPs and SSA/Ps as well as between SSA/Ps and CR samples, the three aforementioned groups were excluded from the further study. The following groups were considered for further analysis: (1) 139 genes that were DE between SSA/Ps and both HP and CR samples (Table 4), (2) 134 genes exclusively DE between HPs and SSA/Ps (Table 5) and (3) 1058 genes exclusively DE between CR and SSA/P samples (Table 6). The 121 genes in the intersection of all three comparisons (Table 7) were excluded for the sake of rigor, i.e. for considering only genes that were DE between different polyp types, without referring to the anatomical location. Although these 121 genes were excluded here, further investigation is needed to assess their importance in differentiating between HPs and SSA/Ps.
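
The filtering logic described above can be sketched with plain set operations. This is a minimal illustration rather than the pipeline actually used: the helper names `is_de` and `filter_groups`, the gene labels and all numeric values are hypothetical, and only the two thresholds and the exclusion of genes DE between CR and CL samples are taken from the text.

```python
def is_de(log2_fc, p_adj, fc_cut=0.5, p_cut=0.05):
    """Call a gene DE using the two thresholds stated in the text."""
    return abs(log2_fc) > fc_cut and p_adj < p_cut


def filter_groups(cr_cl, hp_ssap, cr_ssap):
    """Split DE gene sets into the three groups kept for further analysis.

    Genes DE between right and left controls (cr_cl) are removed first,
    so that location-driven differences do not enter any group.
    """
    hp_kept = hp_ssap - cr_cl
    cr_kept = cr_ssap - cr_cl
    return {
        "both": hp_kept & cr_kept,          # DE versus both HP and CR
        "hp_ssap_only": hp_kept - cr_kept,  # exclusively DE in HP_SSA/P
        "cr_ssap_only": cr_kept - hp_kept,  # exclusively DE in CR_SSA/P
    }


# Hypothetical toy input: gene -> (log2FC, adjusted p-value) per comparison.
stats = {
    "CR_CL": {"g1": (1.2, 0.01), "g2": (0.1, 0.50),
              "g3": (0.2, 0.90), "g4": (0.3, 0.20)},
    "HP_SSA/P": {"g1": (0.9, 0.02), "g2": (-1.5, 0.001),
                 "g3": (0.8, 0.03), "g4": (0.6, 0.20)},
    "CR_SSA/P": {"g1": (1.0, 0.01), "g2": (-1.1, 0.004),
                 "g3": (0.4, 0.01), "g4": (2.0, 0.001)},
}
de = {name: {g for g, (fc, p) in table.items() if is_de(fc, p)}
      for name, table in stats.items()}
groups = filter_groups(de["CR_CL"], de["HP_SSA/P"], de["CR_SSA/P"])
```

In this toy input, `g1` is DE in all three comparisons and is therefore dropped, mirroring the treatment of the 121 genes in the three-way intersection.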

FIG. 2 presents a PCA plot illustrating the difficulties in differentiating between SSA/P and HP samples even at the molecular level. The two groups are clearly intermingled when all expressed genes are included (FIG. 2A), and the separation is much better when genes DE between HPs and SSA/Ps as well as between SSA/Ps and CR samples are included with the exclusion of genes DE between CR and CL samples (FIG. 2C). Thus, the filtering step allows a more detailed characterization of the differences between HPs and SSA/Ps, hence the better separation.

Characteristic Differences Between SSA/Ps and Other Samples.

To understand more clearly the biological differences between SSA/Ps and other samples, only genes expressed at the same level in HP and CR samples and significantly up- or down-regulated in SSA/Ps were first considered. At this step, only genes satisfying the following conditions were considered: (1) the gene expression level (e) satisfied the inequality e=|CR−HP|/(CR+HP+0.01)<0.1 and (2) the gene was significantly DE in the CR_SSA/P and HP_SSA/P comparisons.
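
As a small worked example of condition (1), the sketch below reads the criterion as an absolute relative difference, |CR−HP|/(CR+HP+0.01)<0.1, which is an assumption about the intended notation. The function name `same_level` and the expression values are hypothetical, and non-negative expression values are assumed.

```python
def same_level(cr, hp, tol=0.1, eps=0.01):
    """Return True when CR and HP expression are effectively equal.

    The difference is scaled by the sum of both values; eps avoids
    division by zero when a gene is unexpressed in both groups.
    """
    return abs(cr - hp) / (cr + hp + eps) < tol


# Hypothetical mean expression values for two genes.
print(same_level(10.0, 10.5))  # ratio 0.5 / 20.51, below 0.1
print(same_level(10.0, 15.0))  # ratio 5.0 / 25.01, above 0.1
```

A gene passing this check would still need to satisfy condition (2), i.e. be significantly DE in both the CR_SSA/P and HP_SSA/P comparisons.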

There were only five genes down-regulated in SSA/Ps and expressed at the same level in HPs and CRs (FIG. 3). Two of them regulate cell differentiation and proliferation: NEUROD1 (neuronal differentiation 1) is involved in enteroendocrine cell differentiation and CHFR (checkpoint with forkhead associated and RING Finger) is an early mitotic checkpoint regulator that delays transition to metaphase in response to mitotic stress. CHFR has been found to be frequently inactivated in many malignancies by promoter methylation, in particular, in microsatellite stable and BRAF wild-type stage II CRCs. NEU4, another down-regulated gene, maintains normal mucosa and its down-regulation was suggested to contribute to invasive properties of colon cancers. Other down-regulated genes are RASL11A (regulates translation and transcription) and WSCD1 (WSC domain containing 1, poorly characterized).

Twenty out of thirty genes up-regulated in SSA/Ps and expressed at the same level in CR and HP samples were found to be interferon-regulated (IR). In addition to modulating innate immune response, interferons regulate a large variety of cellular functions, such as cell proliferation and differentiation, and play important roles in inflammatory diseases and anti-tumor response. These twenty genes were represented by (1) genes involved in the epithelial-mesenchymal transition (EMT): PIK3R3, RAB27B, and MSX2; (2) classical IR genes: GBP2, CFB, TRIB2, TBX3, OAS2, IFIT3, XAF1, MX1, IDO1, CXCL9, CXCL10, GBP1, CCL22, CCL2; (3) genes not conventionally considered IR: RAMP1, PARP14, and TPD52L1.

Among these twenty genes there were three especially interesting in the context of SSA/Ps progression toward cancer. Indoleamine 2,3-dioxygenase 1 (IDO1) has attracted considerable attention recently because of its immune-modulatory role besides the degradation of tryptophan. IDO regulates T cell activity by reducing the local concentration of tryptophan and increasing the production of its metabolites that suppress T lymphocyte proliferation and induce apoptosis. Because most human tumors constitutively express IDO, the idea that IDO inhibitors may reverse immune suppression associated with tumor growth is very attractive for immunotherapy, and a competitive inhibitor for IDO (1-mT) is currently in clinical trials. IDO1 was 2.7 times up-regulated in SSA/Ps as compared to HP, CR samples. PIK3R3, an isoform of class IA phosphoinositide 3-kinase (PI3K) that specifically interacts with cell proliferation regulators and promotes metastasis and EMT in colorectal cancer, was also up-regulated in SSA/Ps. PARP14 promotes aerobic glycolysis or the Warburg effect, used by the majority of tumor cells, by inhibiting pro-apoptotic kinase JNK1. Immunosuppressive state, the shift toward aerobic glycolysis and the EMT are all considered major hallmarks of cancer. While these three genes are only infinitesimal parts of the invasive cascades, their up-regulation points toward how SSA/Ps may progress to cancer.

Several IR genes reported here have also been found to be up-regulated in a number of malignancies (including CRCs). For example, RAB27B was expressed at a high level and is a special member of the small GTPase Rab family regulating exocytosis which has been associated with a poor prognosis in patients with CRC. Increased expression of RAB27B has been shown to predict a poor outcome in patients with breast cancer. The suggested mechanism by which Rab27b stimulates invasive tumor growth includes regulation of the heat shock HSP90a protein and the indirect induction of MMP-2, a protease that requires an association with extracellular HSP90a for its activity to accelerate the degradation of extracellular matrix. The transcription factor TBX3 (T-box 3), which plays an important role in embryonic development, was also up-regulated in SSA/Ps. Previously it was suggested that TBX3 promotes an invasive cancer phenotype and more recently it was also shown that increased expression of TBX3 was associated with a poor prognosis in CRC patients. The transcriptional co-regulator LIM-only protein 4 (LMO4) has been associated with poor prognosis and is overexpressed in about 60% of all human breast tumors and has been shown to increase cell proliferation and migration. LMO4 was up-regulated in SSA/Ps. Tumor protein D52-like proteins (TPD52) are small proteins that were first identified in breast cancer, are overexpressed in many other cancers, but remain poorly characterized. TPD52L1, a member of the family, was up-regulated in SSA/Ps.

Besides the twenty IR genes, there were other interesting genes up-regulated in SSA/Ps and expressed at the same level in CR and HP samples. MUC6 (mucin 6) was the most highly up-regulated gene and has been previously suggested as a candidate biomarker for SSA/Ps but later was found to be not specific enough to reliably differentiate SSA/Ps from HPs. KIZ (kizuna centrosomal protein) is a gene that is critical for the establishment of robust mitotic centrosome architecture and proper chromosome segregation at mitosis. While depletion of KIZ results in multipolar spindles, how up-regulation of KIZ affects mitosis is unknown. SPIRE1, an actin organizer, was recently found to contribute to invadosome functions by speeding up extracellular matrix lysis when overexpressed.

One of the limitations of studying differentially expressed genes one gene at a time is that it does not allow a systems-level view of global changes in expression and co-expression patterns between phenotypes. Thus, the inventors sought to identify all pathways that were significantly up- or down-regulated, as well as differentially co-expressed between SSA/Ps and HP, CR samples. Pathways were represented by all gene ontology (GO) terms from the C5 collection of gene sets in MSigDB.

Pathways, Differentially Expressed Between SSA/Ps and HP, CR Samples.

To find pathways that were significantly up- or down-regulated, ROAST, a parametric multivariate rotation gene set test, was applied. ROAST uses the framework of linear models and tests whether, for all genes in a pathway, a particular contrast of the coefficients is non-zero. It can account for correlations between genes and has the flexibility of using different alternative hypotheses, testing whether the direction of changes for a gene in a pathway is up, down or mixed (up or down). Only pathways where genes were significantly up- or down-regulated (FDR<0.05) were selected. There were fifteen pathways significantly up-regulated in SSA/Ps as compared to HP, CR samples (Table 1). In agreement with the pattern found for individual genes, two out of the fifteen pathways were ‘Inflammatory response’ and ‘Immunological synapse’ (Table 1). The GO term ‘Extracellular structure organization and biogenesis’ overlaps with two KEGG pathways: ‘KEGG focal adhesion’ and ‘KEGG ECM receptor interaction’. Overexpression of these pathways as well as of the ‘Cell adhesion’ (two pathways) category might indicate changes in cell motility and migration ability in the SSA/P phenotype as compared to HP, CR samples. Up-regulation of the ‘Cell growth and death’ (two pathways) category suggests increased cellular proliferation in the SSA/P phenotype.

There was only one pathway down-regulated in SSA/Ps as compared to HP, CR samples, namely ‘Transmembrane receptor protein serine threonine kinase signaling pathways’ (FDR<0.05). The pathway generates a series of molecular signals as a consequence of a transmembrane receptor serine/threonine kinase binding to its ligand and regulates fundamental cell processes such as proliferation, differentiation, death, cytoskeletal organization, adhesion and migration. For this pathway, one of the most significantly down-regulated genes was HIPK2 (homeodomain interacting protein kinase 2). HIPK2 interacts with many transcription factors including p53 and is a tumor suppressor that regulates cell-cycle checkpoint activation and apoptosis. Therefore, its down-regulation may contribute to up-regulation of the ‘Positive regulation of cell proliferation’ pathway. However, given that ‘Transmembrane receptor protein serine threonine kinase signaling pathways’ regulates many fundamental cellular processes, its main downstream targets in the case of SSA/Ps require further study.

Pathways, Differentially Co-Expressed Between SSA/Ps and HP, CR Samples.

To find pathways that were differentially co-expressed, an approach that assesses multivariate changes in the gene co-expression network between two conditions, the Gene Sets Net Correlations Analysis (GSNCA), was applied. GSNCA tests the hypothesis that the co-expression network of a pathway did not change between two conditions. In addition, for each condition it builds a core of the co-expression network, using the most highly correlated genes, and finds a ‘hub’ gene, defined as the one with the highest correlations with the other genes in a pathway (see Rahmatallah et al., Bioinformatics 2014; 30(3): 360-8, the disclosure of which is hereby incorporated by reference in its entirety, for more detail). In other words, hub genes are the most ‘influential’ genes in a pathway. When hub genes in a pathway are different between phenotypes, it points toward regulatory changes in the pathway dynamics.
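
The notion of a hub gene shifting between phenotypes can be illustrated with a deliberately simplified stand-in. The sketch below scores each gene by the sum of its absolute Pearson correlations with the other genes in the set and reports the top-scoring gene; this scoring rule is an assumption made for illustration, and the actual gene weighting used by GSNCA is described in the cited reference and is not reproduced here. Gene labels and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def hub_gene(expr):
    """Return the gene with the largest summed |correlation| to the others.

    expr maps a gene label to its expression values across the samples
    of a single condition.
    """
    score = {
        g: sum(abs(pearson(expr[g], expr[h])) for h in expr if h != g)
        for g in expr
    }
    return max(score, key=score.get)


# Hypothetical expression profiles for one gene set in two conditions.
condition_1 = {
    "geneA": [1, 2, 3, 4, 5, 6],
    "geneB": [1, 3, 2, 5, 4, 6],
    "geneC": [2, 1, 4, 3, 6, 5],
}
# In the second condition the profiles of geneA and geneC are swapped,
# so the most connected gene changes.
condition_2 = {
    "geneA": [2, 1, 4, 3, 6, 5],
    "geneB": [1, 3, 2, 5, 4, 6],
    "geneC": [1, 2, 3, 4, 5, 6],
}
print(hub_gene(condition_1), hub_gene(condition_2))
```

A change in the reported gene between the two conditions corresponds to the kind of hub shift discussed in the text.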

There were seven pathways significantly differentially co-expressed between SSA/Ps and CR, HP samples (P<0.05). Five out of seven were pathways regulating homologous and non-homologous recombination, DNA replication, GTPase activities and protein targeting towards a membrane using signals contained within the protein (FIG. 7, FIG. 8, FIG. 9, FIG. 10, and FIG. 11). For all five pathways, hub genes were different between HPs and SSA/Ps, with a shift in SSA/Ps toward hub genes related to genomic instability. For example, for the ‘Meiosis I’ and ‘Meiotic recombination’ pathways, hub genes were RAD51 and MRE11A in HPs and SSA/Ps, respectively. Both proteins are involved in the homologous recombination and repair of DNA double strand breaks. MRE11A also participates in alternative end-joining (A-EJ), an important pathway in the formation of chromosomal translocations. The shift from RAD51 to MRE11A in the SSA/P phenotype might indicate an increased genomic instability, the key change in all cancer cells.

For the ‘Golgi stack’ pathway, the shift of hub genes was associated with the well-known phenotypic difference between HPs and SSA/Ps (FIG. 4). The hub gene in the HP phenotype was RAB14, a low molecular mass GTPase that is involved in intracellular membrane trafficking and cell-cell adhesion. The hub gene in the SSA/P phenotype was B3GALT6, a beta-1,3-galactosyltransferase required for glycosaminoglycan (mucopolysaccharides) synthesis, including mucin. The presence of abundant surface mucin is the conventional colonoscopic characteristic of SSA/Ps. For ‘Hormone activity’ in the HP phenotype the hub gene was IGF1, the insulin-like growth factor that promotes cell proliferation and inhibits apoptosis, stimulates glucose transport in cells and enhances glucose uptake (FIG. 12). In the SSA/P phenotype, the hub gene was PYY, encoding a member of the neuropeptide Y (NPY) family of peptides. This gut peptide plays important roles in energy and glucose homeostasis, in regulating gastrointestinal motility and absorption of water and electrolytes and has been associated with several gastrointestinal diseases. Its role in the SSA/P phenotype, if any, remains to be defined.

These cases illustrate the ability of GSNCA to confirm existing knowledge, generate new testable hypotheses and raise interesting questions. For the ‘Golgi stack’ pathway, the shift from RAB14 toward B3GALT6, essential for mucopolysaccharide synthesis, corresponded to known phenotypic differences between HPs and SSA/Ps. The involvement of the deficient mismatch repair (dMMR) pathway (that includes MRE11) in CRC is well documented. Recently, the truncated MRE11 polypeptide was found to be a significant prognostic marker for long-term survival and response to treatment of patients with stage III CRC. GSNCA highlighted MRE11A as a new hub gene in the ‘Meiosis I’ and ‘Meiotic recombination’ pathways, and it would be worth investigating its mutational status and prognostic potential in the context of SSA/Ps.

Based on the analysis of individual genes and of differentially expressed and co-expressed pathways, the differences between SSA/Ps and HP, CR samples involve: (1) up-regulation of IR genes, EMT genes and genes previously associated with the invasive cancer phenotype; (2) up-regulation of pathways implicated in proliferation, inflammation and cell-cell adhesion, and down-regulation of the serine threonine kinase signaling pathway; and (3) de-regulation of a set of pathways regulating cell division, protein trafficking and kinase activities.

Given the complexity of the molecular processes underlying the SSA/P phenotype, involving hundreds of differentially expressed genes and many pathways, for the practical purpose of readily distinguishing SSA/Ps from HPs, the inventors developed a platform-independent molecular classifier with a low classification error rate (see below).

Molecular Classifiers.

Typically, the development of molecular classifiers consists of the following steps: feature selection, model selection, training, and estimation of the classification error rate, with every step potentially leading to an inflated performance estimate. The systematic errors in classifier development, such as inappropriate applications of cross-validation for classifiers' training and testing, are usually the first to blame for poor generalizability (high error rate on independent data sets). Poor generalizability is further aggravated when the training and independent test data are obtained using different platforms, e.g. different microarray platforms, or microarrays and RNA-seq. To avoid such errors, the inventors developed a new feature selection step identifying the genes most concordant between different platforms. After the new feature selection step was implemented, a classifier was trained on RNA-seq data and further tested on two independent microarray data sets (testing sets, see Methods for more details). Identifiers from different platforms were mapped to gene symbols and only genes that were expressed in RNA-seq data and present on both microarray platforms were considered (Table 8).

Feature Normalization.

For classifier development, 139 genes DE between SSA/Ps and HP, CR samples (Table 4) were considered. Gene expressions for both RNA-seq and microarray platforms were normalized to a common range by subtracting the median absolute deviation (MAD) from each gene's expression. Hence, gene expressions were centered around zero and genes with large fold changes between two phenotypes had positive expressions under one phenotype and negative expressions under the other. Genes with small variability were filtered out (MAD<0.1). Finally, only the genes expressed in all three platforms (117 genes) were considered for further classifier design steps.

Feature Selection Step.

Selecting only genes (features) with high concordance between platforms is crucial to designing a platform-independent classifier. A platform-independent classifier, trained using one platform, should have a low classification error rate when tested using another platform. Here, to assess gene concordance between platforms, a new non-parametric test was developed (see Methods for details). The test identified genes robustly differentiating two phenotypes under different platforms, the best candidates for an inter-platform signature. Previously, the concordance between platforms has been measured by the correlation between mean expressions or fold changes or by the intersection between lists of DE genes.

The idea behind the new test is simple: identify genes with expression levels highly correlated between platforms. The practical difficulty of implementing the idea is that the numbers of samples, as well as the sample identities, are different between platforms. Consider two distributions: (1) correlation coefficients for all genes between two platforms, preserving phenotypic labels (ρ_(true)) and (2) correlation coefficients for all genes between two platforms, randomly resampling phenotypic labels (ρ_(random)). FIG. 13 presents the distributions of ρ_(true) and ρ_(random) when the HP and SSA/P samples from the RNA-seq training data were compared with the Illumina and Affymetrix data sets. Some genes had higher correlations when phenotypic labels were preserved, compared to when they were randomly resampled, introducing negative skewness to the distribution of ρ_(true) (see FIG. 13). In other words, the correlations of these genes between platforms were higher than expected by chance, where chance is illustrated by the case when phenotypic labels were randomly resampled. These genes were the candidate concordant genes. More formally, to identify concordant genes, the null hypothesis H₀: ρ_(true)≤ρ_(random)+max(SD(ρ_(true)∪ρ_(random))) was tested.

FIG. 5 illustrates how the test works using two examples of typical MAD-normalized gene expressions in two platforms. In one example, forty observations were sampled from two normal distributions N(0.5, 0.25) and N(−0.5, 0.25), representing different phenotypes. In this example, the fold change in both platforms was larger than the within-phenotype variability (FIG. 5A) and the correlation coefficient between platforms (ρ_(true)) was high. When phenotypic labels were randomly resampled, the fold change in both platforms became negligible as compared to the within-phenotype variability (FIG. 5B) and the correlation coefficient between platforms (ρ_(random)) became low. In another example, forty observations were sampled from two normal distributions N(0.5, 1) and N(−0.5, 1), again representing different phenotypes. However, in this example, the fold change in both platforms was smaller than the within-phenotype variability (FIG. 5C and FIG. 5D) and the correlation coefficient between platforms was low when phenotypic labels were either preserved or randomly resampled. Although the fold change between phenotypes was the same in both examples (log₂FC=1), the Pearson correlation coefficient between expressions in two platforms preserving phenotypic labels (ρ_(true)) was higher in case A compared to case C because of the lower within-phenotype variability. Randomly resampling phenotypic labels led, expectedly, to much lower correlations between two platforms (ρ_(random)) (FIG. 5B and FIG. 5D). Accordingly, ρ_(true)>ρ_(random) in the first example (FIG. 5A and FIG. 5B) but not in the second (FIG. 5C and FIG. 5D). Taking the average correlation between platforms over a large number of iterations, H₀ will be rejected for the first example (FIG. 5A and FIG. 5B) but not for the second (FIG. 5C and FIG. 5D). The Methods section summarizes the steps of the proposed test.
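
The two simulated examples can be reproduced in outline with the sketch below. It is an illustration under stated assumptions, not the test as specified in the Methods: the second parameter of N(·, ·) is taken to be a variance, the forty observations are split as twenty per phenotype, label resampling is modeled by shuffling each platform's observations independently, and the function names, seed and iteration count are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_correlations(variance, n_per_class=20, n_iter=500, seed=1):
    """Average rho_true and rho_random for one simulated gene.

    Each platform is simulated as n_per_class draws around +0.5 followed
    by n_per_class draws around -0.5, so position encodes the phenotype.
    """
    rng = random.Random(seed)
    sd = math.sqrt(variance)

    def platform():
        return ([rng.gauss(0.5, sd) for _ in range(n_per_class)]
                + [rng.gauss(-0.5, sd) for _ in range(n_per_class)])

    rho_true = rho_random = 0.0
    for _ in range(n_iter):
        x, y = platform(), platform()
        rho_true += pearson(x, y)      # phenotypic labels preserved
        xs, ys = x[:], y[:]
        rng.shuffle(xs)                # labels randomly resampled
        rng.shuffle(ys)
        rho_random += pearson(xs, ys)
    return rho_true / n_iter, rho_random / n_iter


true_a, random_a = mean_correlations(variance=0.25)  # like FIG. 5A/5B
true_c, random_c = mean_correlations(variance=1.0)   # like FIG. 5C/5D
print(true_a, random_a, true_c, random_c)
```

Under these assumptions the expected correlation with labels preserved is 0.25/(0.25+σ²), i.e. about 0.5 for the low-variability case and about 0.2 for the high-variability case, while the shuffled correlations average close to zero in both.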

The test was used to find genes with high concordance between RNA-seq and Illumina platforms (23 genes detected), RNA-seq and Affymetrix platforms (20 genes detected), and between RNA-seq and both Illumina and Affymetrix platforms (16 genes detected). Only genes detected by the Wilcoxon test at P<0.05 were considered. The values of the term max(SD(ρ_(true)∪ρ_(random))) were 0.41 and 0.39 when RNA-seq data were compared with Illumina and Affymetrix data sets, respectively.

Classifier Design and Gene Signatures.

The model selection step provides great flexibility because there are many machine learning algorithms available for classification purposes. The nearest shrunken centroid classifier (SCC) was selected because it was successfully used before for developing many microarray-based classifiers, in particular a prognostic classifier in CRCs. To select the threshold value that returns the minimum mean error with the least number of genes, a 3-fold cross-validation was performed over a range of threshold values for 100 iterations.
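
The role of the threshold can be sketched as follows. The snippet is a simplified illustration and not a full shrunken centroid implementation: it applies soft-thresholding to hypothetical per-gene standardized centroid differences and then picks, from hypothetical cross-validation summaries, the threshold with the lowest mean error and, among ties, the fewest genes. All names and numbers are assumptions made for the example.

```python
import math


def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within delta of zero become 0."""
    magnitude = abs(d) - delta
    if magnitude <= 0:
        return 0.0
    return math.copysign(magnitude, d)


def surviving_genes(scores, delta):
    """Genes whose shrunken centroid difference is still non-zero."""
    return [g for g, d in scores.items() if soft_threshold(d, delta) != 0.0]


def choose_threshold(cv_results):
    """Pick the (threshold, mean_error, n_genes) entry with the lowest
    mean error, breaking ties in favor of fewer genes."""
    return min(cv_results, key=lambda r: (r[1], r[2]))


# Hypothetical standardized centroid differences for three genes.
scores = {"geneA": 2.4, "geneB": -1.1, "geneC": 0.3}
print(surviving_genes(scores, 0.5))   # geneC is shrunk to zero
print(surviving_genes(scores, 1.5))   # only geneA remains

# Hypothetical cross-validation summaries averaged over iterations.
cv_results = [(0.5, 0.10, 40), (1.0, 0.05, 25), (1.5, 0.05, 13), (2.0, 0.20, 5)]
print(choose_threshold(cv_results))
```

Larger thresholds remove more genes, so the selection rule trades classification error against signature size in the way described above.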

Training the classifier using the RNA-seq data set and considering only the genes with high concordance with the Illumina, Affymetrix, and both platforms yielded three signatures of 18, 16, and 13 genes (see Table 2). The 18- and 16-gene signatures resulted in zero (out of 12 Illumina samples) and three (out of 17 Affymetrix samples) errors, respectively. Classification errors did not change when the 13-gene signature was used instead. Hence these 13 genes were considered the smallest successful signature for both Illumina and Affymetrix platforms. The samples in the Illumina data set were identified as belonging to SSA/P or HP phenotypes by gastrointestinal pathologists based on a higher stringency criterion than what has been done for the samples in the Affymetrix data set. It is therefore no surprise that there was less ambiguity in classifying the Illumina samples. Although the Illumina samples were acquired by a different platform compared to the training RNA-seq data set, they were classified without errors. Aside from the stringent criterion in assigning phenotype labels for Illumina samples, this result could be due to the higher resolution in quantifying gene expression by the RNA-seq platform.

In conclusion, the independent validation (i.e. using different platforms) results have shown the feasibility of building molecular classifiers using RNA-seq training data. Moreover, classifiers built using one platform (RNA-seq) were applicable to other platforms (Affymetrix, Illumina) and had low classification error rates in predicting HP or SSA/P phenotypes as long as only concordant features were considered.

Smallest Successful Signature.

The genes included in the smallest signature (13 genes) were on average approximately four-fold up- or down-regulated between SSA/Ps and HPs (Table 3). The average absolute fold change considering all the 14006 expressed genes in the RNA-seq training data set was 1.27. There were three down- and ten up-regulated genes in SSA/Ps, involved in several molecular processes that have been discussed earlier. Down-regulated genes included NTRK2 (neurotrophic tyrosine kinase receptor, type 2), CHFR (negative regulator of cell cycle checkpoint) and CHGA (chromogranin A, endocrine marker). NTRK2 controls the signaling cascade that mainly regulates cell growth and survival.

Up-regulated genes included several genes (SLC7A9, SEMG1, SBSPON and MEGF6) that are not well characterized functionally (except SLC7A9, a marker for cystinuria) and are not discussed here. Two genes (KIZ and SPIRE1) were among the genes up-regulated in SSA/Ps and equally down-regulated in HP, CR samples (FIG. 3). TROP-2 (TACSTD2, tumor-associated calcium signal transducer 2) is a cell-surface transmembrane glycoprotein overexpressed in many epithelial tumors. TROP-2 has been suggested as a biomarker of clinical prognosis and as a potential therapeutic target in colon cancer, and an antibody-drug conjugate targeting TROP-2 is currently in phase II clinical trials. Claudin-1 (CLDN1, tight junction protein) was also up-regulated. Specifically, Claudin-1 has been suggested to be involved in the regulation of colorectal cancer progression by up-regulating Notch- and Wnt-signaling and mucosal inflammation. In addition, CLDN1 has been associated with liver metastasis of CRC. The phospholipase PLA2G16 was also up-regulated, and its up-regulation may be a signal of gain-of-function activities of mutant p53 that is required for metastasis. Finally, PTAFR (platelet activating factor receptor) has been found to stimulate EMT by activating the STAT3 cascade.

In sum, the up-regulated signature genes included those previously associated with invasive cell activities (CLDN1, PLA2G16, PTAFR, SPIRE1) and spindle formation (KIZ), while the down-regulated genes included checkpoints controlling cell growth (CHFR, NTRK2).

Summary Metric with Class Probability.

The ultimate goal of building a classifier and finding gene signatures is to use the signature in clinical practice for diagnostic and prognostic purposes. Here, a simple procedure that uses the signatures in Table 2 was developed to classify new samples as either HP or SSA/P and to provide a class probability for the decision. The mean of the MAD-normalized expression of the genes in the signature was used as a summary metric (SM). Since most of the genes in the signatures in Table 2 were over-expressed in SSA/P, SM>0 for SSA/P samples and SM<0 for HP samples. Before calculating the mean expression, the signs of the expression values of the few genes that were over-expressed in HP were inverted. This step increased the magnitude of the mean regardless of its sign. There were only three genes over-expressed in HP in the 13-gene signature (CHFR, CHGA and NTRK2), one in the 16-gene Affymetrix signature (NTRK2), and four in the 18-gene Illumina signature (CHGA, CPE, DPP10, and NTRK2). The class assignment (HP or SSA/P) depends simply on the sign of the mean expression.
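As an illustration only, the summary metric calculation described above can be sketched in Python as follows. The function names, the dictionary layout, and the example expression values are hypothetical and are not part of the disclosed R implementation; the gene symbols are those named above for the 13-gene signature.

```python
from statistics import mean

# Genes of the 13-gene signature that are over-expressed in HP (their sign is inverted).
HP_GENES = {"CHFR", "CHGA", "NTRK2"}

def summary_metric(expr, hp_genes=HP_GENES):
    """Mean of the MAD-normalized expression values of one sample.

    expr maps gene symbol -> MAD-normalized expression (hypothetical layout).
    """
    signed = [-value if gene in hp_genes else value for gene, value in expr.items()]
    return mean(signed)

def classify(sm):
    """Class call from the sign of SM; SM == 0 is left undetermined (assumption)."""
    if sm > 0:
        return "SSA/P"
    if sm < 0:
        return "HP"
    return "undetermined"

# Made-up values for three signature genes of a single sample.
sample = {"CLDN1": 1.0, "KIZ": 0.5, "NTRK2": -1.5}
sm = summary_metric(sample)
print(sm, classify(sm))
```

Inverting the sign of the HP-associated genes makes every gene in the signature push SM in the same direction for a given phenotype, which is why the sign of SM alone suffices for the class call.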

MAD-normalized gene expression values had an approximately Laplace-like distribution (FIG. 14) and SM distributions were approximately normal (FIG. 15). According to the central limit theorem, the SM distributions should be normal, especially for signatures with a large number of genes p≥30 (FIG. 15). The normal approximation is still valid when the signature size p<30 if the population is not too different from a normal distribution. There are several ways of assigning a class probability to a new sample using the training RNA-seq data set as a reference. The distribution of SM can be estimated by calculating SMs for many random signatures of the same size as the signature in use. The probability of an assigned SSA/P (HP) class is the cumulative distribution function CDF(SM) (1−CDF(SM)) of the empirical distribution of SM after standardization (FIG. 6). Another possibility is to use the normal approximation of SM (FIG. 6). The first approach is impaired by possible differences in the distribution of SM between platforms. For example, applying MAD normalization to the log_(e)-scale FPKM RNA-seq data yielded an SM with a negative tail that extended beyond the corresponding tail in microarray data (FIG. 15). The second approach is impaired by deviation from normality, especially for very small signatures. Generally, the distribution of SM was normal-like with higher kurtosis for small signatures. While the distribution of SM had kurtosis ≈8 and ≈4 for RNA-seq and microarray data, respectively (using 15 genes in a signature), the kurtosis of a normal distribution is 3.
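The two approaches just described (empirical CDF and normal approximation) can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: `reference_sms` is a hypothetical list of SM values computed from random signatures of the same size, and the function names are assumptions.

```python
import math
from statistics import mean, pstdev

def normal_cdf(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def empirical_cdf(reference, value):
    """Fraction of reference values that are <= value."""
    return sum(1 for r in reference if r <= value) / len(reference)

def class_probability(sm, reference_sms, method="empirical"):
    """Return (class, probability) for one sample.

    SM is standardized with the mean and standard deviation of the reference
    SMs; P(SSA/P) = CDF(SM) when SM > 0, otherwise P(HP) = 1 - CDF(SM).
    """
    mu, sd = mean(reference_sms), pstdev(reference_sms)
    z = (sm - mu) / sd
    if method == "normal":
        cdf = normal_cdf(z)
    else:
        standardized = [(r - mu) / sd for r in reference_sms]
        cdf = empirical_cdf(standardized, z)
    return ("SSA/P", cdf) if sm > 0 else ("HP", 1.0 - cdf)

# Toy reference distribution (made-up values).
reference_sms = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(class_probability(1.5, reference_sms))
print(class_probability(1.5, reference_sms, method="normal"))
```

In practice the reference list would hold many more values than this toy example, since the empirical CDF can only resolve probabilities in steps of one over the number of reference SMs.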

Due to the potential difficulties in fitting an exact distribution to SM, another solution was found. A lower bound for P(X≥SM) as the probability for an assigned SSA/P class and P(X≤−SM) as the probability for an assigned HP class can be estimated using Cantelli's inequality (also known as the one-sided Tchebycheff inequality). Cantelli's inequality bounds the probability that observations from some distribution lie above or below their mean by a given amount:

$P(X - \mu \leq a) = \mathrm{CDF}(\mu + a) \geq 1 - \frac{\sigma^{2}}{\sigma^{2} + a^{2}}, \quad a \geq 0$

$P(X - \mu \leq a) = \mathrm{CDF}(\mu + a) \leq \frac{\sigma^{2}}{\sigma^{2} + a^{2}}, \quad a < 0$

We either choose a=SM and σ=0.14 (which happened to be the standard deviation of SM in all three platforms when the number of genes is 15), or choose a=standardized SM and σ=1. FIG. 6 presents the Cantelli lower bound (CLB) SSA/P (HP) probabilities. When SM∈[−σ, σ] (or SM_(standardized)∈[−1,1]) the probability of class assignment is zero for one class and <50% for the other, therefore no probability was assigned (Uncertain zone, FIG. 6). To avoid false positives, a probability was assigned if and only if the Cantelli lower bound of SM was >0.5. The results of classifying samples in the Illumina and Affymetrix data sets using the summary metric and the class probability assigned to each decision are presented in Table 9, Table 10, and Table 11. For comparison, the class probabilities obtained using the empirical approach, the normal approximation, and the SCC (independent of SM) are also shown. Standardized SM and σ=1 were used. When the Affymetrix samples were classified using the 16-gene signature, 2 of the 3 HP samples misclassified by SCC were deemed uncertain by CLB, whereas SCC assigned them P(SSA/P) of 75% and 94% (Table 10).

Independent Validation and Clinical Diagnostic Tool.

To further validate the accuracy of the 13-gene molecular signature and demonstrate its diagnostic value in clinically relevant settings, expression levels were obtained from 45 independent FFPE samples (24 HPs and 21 SSA/Ps) with real-time qPCR (see Methods). By simply applying proper normalization and summarizing expression levels using the summary metric (see Methods), the 13-gene molecular signature correctly classified 90% of the independent FFPE samples (Table 12). FIG. 16 shows the scatter plot of the first and second principal components of the normalized expression levels. The 13-gene molecular signature indeed placed the HP and SSA/P independent FFPE samples in two well-separated clusters. This approach is simple and relies on the ability of the combined 13 genes to properly distinguish between HP and SSA/P, rather than relying on a complex classifier. The steps required to apply this simple approach as a clinical diagnostic tool to new qPCR samples are summarized in Methods. It is worth mentioning here that the signature, which was found using RNA-seq data from fresh tissue samples, achieved a remarkable correct classification rate despite any possible RNA degradation in preserved FFPE tissues.

Discussion.

Conventionally, SSA/Ps are distinguished from HPs on the basis of histopathological features. Because HPs have similar histopathological features, a significant error rate in classifying SSA/Ps as HPs can occur, especially if expert gastrointestinal pathologists are not available. This clinical challenge was the driver of this study, which aimed to develop a biomarker-based test to distinguish between SSA/Ps and HPs. Another challenge was to elucidate the molecular mechanisms contributing to the differences between SSA/P and HP phenotypes.

Previously, the differences between phenotypes were considered mostly at the level of individual genes. The genes previously reported as DE between SSA/Ps and CR (or HP) samples (MUC17, TFF1, CTSE and SLIT2) were also found in the present analysis. However, these genes were also DE between CR and CL samples, so their association with HP and SSA/P phenotypes is uncertain. Among other potential SSA/P biomarkers (ANXA10, FABP6 and TFF2), ANXA10 was found to be significantly DE between HP and SSA/P samples (Table 5) and TFF2 was found to be significantly DE between SSA/Ps and HP, CR samples (Table 4). FABP6 was not significantly DE.

To get a systems-level view of the differences between HP and SSA/P phenotypes, the data were analyzed employing different functional units (genes and pathways) as well as different regulatory relationships (differential expression, co-expression). At the level of individual genes, only genes expressed at the same level in HP and CR samples and significantly up- or down-regulated in SSA/Ps were considered. Most interestingly, two thirds of the up-regulated genes were interferon-regulated genes, including IDO1. In addition, at the pathway level, ‘Inflammatory response’ and ‘Immunological synapse’ were also up-regulated in SSA/Ps as compared to HP, CR samples. IDO has been implicated in inflammatory processes; for example, in the mouse model of DSS-induced colitis, it has been shown that IDO1 stimulates an inflammatory response (elevated levels of pro-inflammatory chemokines and cytokines), the same pathway that was found up-regulated here. However, IDO is generally known as immunosuppressive: its activity promotes apoptosis of T-cells and NK cells and induces the differentiation of T regulatory cells (T_(regs)). The mechanism by which IDO mediates inflammation is not well understood, but the connection between IDO-mediated inflammation and immunosuppression in tumor cells has been discussed. It could be that IDO1 also plays a role in driving SSA/Ps toward tumor progression by increasing the inflammatory state and facilitating immune escape, but whether there is a link requires further study. Other important up-regulated genes and pathways differentiating SSA/P from HP phenotypes involve cell motility, migration ability, EMT and ECM interaction (FIG. 3 and Table 1), which impact invasive and metastatic cell behavior, another important hallmark of cancer. Considering pathways differentially co-expressed between SSA/P and HP phenotypes, it was found that hub genes were always different between the two phenotypes (R code). For two differentially co-expressed meiosis-related pathways, the shift was from RAD51 to MRE11A, a gene involved in non-homologous recombination and the mismatch repair pathway. One of the most studied genotypic subtypes of CRC is that characterized by a deficient mismatch repair pathway (dMMR), usually found in combination with microsatellite instability (MSI). Whether SSA/Ps indeed result in the dMMR CRC subtype remains to be studied. For now, as evidenced by the up-regulated pathways and genes found, it appears that SSA/Ps are prone to neoplastic changes, most probably because of their inflammatory and immune escape state as well as increased cell motility and migration ability.

While the computational analysis indeed elucidated genes and pathways DE between SSA/Ps and HPs, indicated plausible directions toward tumor progression and even pointed to existing preventive/treatment options (suppressors of IDO1 and TROP-2), the major goal was more practical: to build a molecular classifier that accurately differentiates between SSA/Ps and HPs. Using the RNA-seq data set and the new feature selection strategy suggested here in combination with the popular SCC, a molecular classifier that is applicable to microarray data was developed. The classifier was tested on two independent data sets and resulted in zero (out of 12 Illumina samples) and three (out of 17 Affymetrix samples) errors. The smallest successful signature for both platforms (13 genes, Table 3) included up-regulated genes previously associated with invasive cell activities (CLDN1, PLA2G16, PTAFR, SPIRE1) and down-regulated checkpoints controlling cell growth (CHFR, NTRK2). In addition, a simple procedure was developed that uses the MAD-normalized signatures in Table 2 to classify new samples as either HP or SSA/P and provides a class probability for the decision, estimated using Cantelli's inequality. The median expression for any gene in any new platform can also be calculated reliably given that enough samples are available. Any new sample from the same platform is then added to re-calculate the median and perform the MAD normalization. For high throughput platforms where thousands of genes are profiled, it is possible to calculate the Cantelli lower bound for SSA/P and HP probabilities. For other clinical settings that profile a few genes (such as real-time qPCR), accurate classification is also possible (results demonstrated herein) but without class assignment probabilities (see Methods). The proposed molecular classifier demonstrates clinical diagnostic value and could be used to classify future samples profiled with microarray, RNA-seq, or real-time qPCR platforms. The more accurate diagnosis of patients with SSA/Ps will enable future studies that better define the risk of colon cancer in patients with SSA/Ps, determine whether subsets of patients have stratified risks for colon cancer, and refine the recommendations for follow-up care of patients with SSA/Ps.

Methods.

RNA-Seq Training Data Set.

The RNA-seq data set used in this study consists of a subset of the NCBI Gene Expression Omnibus (GEO) series with the accession number GSE76987. Ten (10) control left (CL), 10 control right (CR), 10 microvesicular hyperplastic polyp (MVHP), and 21 sessile serrated adenoma/polyp (SSA/P) samples were included. Raw single-end (SE) RNA-seq reads of 50 base pairs were provided in FASTQ file format from the Illumina HiSeq 2000 platform. To ensure high quality reads, the fastX-toolkit (version 0.0.13) was employed to discard any read with a median Phred score<30. The surviving sequence reads were aligned to the UCSC hg19 human reference genome using Tophat (version 2.0.12). Tophat aligns RNA-seq reads to mammalian-sized genomes using the high-throughput short read aligner Bowtie (version 2.2.1) and then analyzes the mapping results to identify splice junctions between exons. Cufflinks was used to quantify the abundances of genes, taking into account biases in library preparation protocols. Cufflinks implements a linear statistical model to estimate, with maximum likelihood, the abundance assigned to each transcript that explains the observed reads (especially reads originating from an exon shared by several isoforms of the same gene). The normalized gene expression values are provided in fragments per kilobase of transcript per million mapped reads (FPKM). The log₂(1+FPKM) transformation was applied to FPKM values in all analyses.

Illumina Testing Data Set.

This data set consists of 6 normal colon samples, 6 microvesicular hyperplastic polyps (MVHPs) and 6 sessile serrated adenomas/polyps (SSA/Ps). The total RNA was converted to cDNA and modified using the Illumina DASL-HT assay and hybridized to the Illumina HumanHT-12 WG-DASL V4.0 R2 expression beadchip. The biopsies were classified by seven gastrointestinal pathologists who reviewed 109 serrated polyps and identified 60 polyps with consensus. The log_(e)-scale of the expression measurements provided under the Gene Expression Omnibus (GEO) accession number GSE43841 was used. Only MVHP and SSA/P samples were considered for the analyses. Illumina probe identifiers were mapped to gene symbol identifiers using the Bioconductor annotation package illuminaHumanWGDASLv4.db. Whenever multiple probes were mapped to the same gene, the probe with the largest t-statistic between MVHP and SSA/P was selected.

Affymetrix Testing Data Set.

Subsets of samples from two GEO data sets, GSE10714 and GSE45270, were considered. The total RNA was extracted from 11 patients with hyperplastic polyps (HPs) from GSE10714 and from 6 patients with sessile serrated adenomas/polyps (SSA/Ps) from GSE45270. Genome-wide gene expression profiles were evaluated with HGU133plus2 microarrays from Affymetrix. The background correction, normalization, and probe summarization steps were implemented using the robust multi-array (RMA) method for the combined samples. Probe identifiers were mapped to gene symbol identifiers using the Bioconductor annotation package hgu133plus2.db. When multiple probes were mapped to the same gene, the probe with the largest t-statistic between the 11 HP samples and the 6 SSA/P samples was selected.
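The probe selection rule used for both microarray data sets can be illustrated with the short sketch below. It assumes, without confirmation from the text, that "largest t-statistic" means the largest absolute value of an unequal-variance (Welch) t-statistic; the probe identifiers and expression values are made up.

```python
import math
from statistics import mean, variance

def t_statistic(group_a, group_b):
    """Welch t-statistic between two groups of expression values (assumed variant)."""
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

def select_probe(probes):
    """Pick the probe with the largest |t| among probes mapped to the same gene.

    probes maps probe id -> (values in phenotype 1, values in phenotype 2).
    """
    return max(probes, key=lambda probe_id: abs(t_statistic(*probes[probe_id])))

# Two hypothetical probes mapped to the same gene.
probes = {
    "probe_1": ([1.0, 2.0, 3.0], [1.5, 2.5, 3.5]),
    "probe_2": ([1.0, 1.1, 0.9], [5.0, 5.1, 4.9]),
}
print(select_probe(probes))
```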

Biospecimens for Independent Validation Studies.

Formalin-fixed paraffin embedded (FFPE) specimens of SSA/Ps (n=21, size range 0.3-3 cm) and HPs (n=24, size range 0.3-0.5 cm) with an unequivocal diagnosis based on the review of at least two independent expert GI pathologists were analyzed. SSA/Ps were from the right colon (sigmoid flexure to cecum) and HPs were from both the left and transverse colon. All samples represented unused de-identified pathologic specimens that were obtained under IRB approval. Total RNA was extracted from six to seven 10 μm slices of FFPE tissues using an RNeasy FFPE kit (Qiagen, Germany) according to the manufacturer's instructions. The concentration of extracted RNA was determined by Qubit RNA HS assays. Reverse transcription reactions were performed using a high capacity RNA-to-cDNA kit (Applied Biosystems, Carlsbad, Calif.) in 20 μL reactions containing 1 μg of RNA, in compliance with the manufacturer's protocol.

qPCR was performed with an ABI 7900HT Fast Real-Time PCR System (Applied Biosystems, Carlsbad, Calif.). With the exception of SBSPON, all primers were selected from the PrimerBank database [101], and specific primers for SBSPON were purchased from OriGene Technologies (Rockville, Md.) (Table S11). As a control we utilized human 18S ribosomal RNA (Qiagen, Germany). The 15 μL reaction mixtures contained 7.5 μL of PowerUp SYBR green 2× master mix (Applied Biosystems, Carlsbad, Calif.), 0.75 μL of each primer pair (10 μM), and 20 ng of cDNA. The reaction involved initial denaturing for 2 minutes at 95° C., followed by 40 cycles of 95° C. for 15 seconds and 60° C. for 60 seconds. All analyses were carried out in triplicate.

Differential Expression Analysis.

Differentially expressed (DE) genes were detected using the values returned by the Cuffdiff2 algorithm. Expressed genes with adjusted p-values P_(adj)<0.05 and absolute log₂ fold change>0.5 were considered DE. P-values were controlled for multiple testing using the Benjamini-Hochberg false discovery rate (FDR) method.
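For illustration, the Benjamini-Hochberg adjustment and the DE filter stated above can be sketched as follows. The function names and the example p-values are hypothetical; this is not the Cuffdiff2 implementation, only a sketch of the thresholding rule.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def is_de(p_adj, log2_fc, alpha=0.05, min_abs_lfc=0.5):
    """DE rule from the text: P_adj < 0.05 and |log2 fold change| > 0.5."""
    return p_adj < alpha and abs(log2_fc) > min_abs_lfc

# Made-up p-values and log2 fold changes for four genes.
pvals = [0.01, 0.04, 0.03, 0.005]
log2_fcs = [1.2, 0.3, -0.8, 2.0]
p_adj = benjamini_hochberg(pvals)
print([is_de(p, fc) for p, fc in zip(p_adj, log2_fcs)])
```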

Feature Selection Step (Concordant Genes).

The following algorithm for selecting genes that are concordant between platforms was developed:

-   1. Let matrices X=[X₁, . . . , X_(n)] and Y=[Y₁, . . . , Y_(m)] represent n(m) p-dimensional measurements of gene expression from two platforms. Let n=n₁+n₂, m=m₁+m₂ where X(Y) has n₁(m₁) samples that belong to phenotype 1 and n₂(m₂) samples that belong to phenotype 2.
-   2. Sample without replacement from each platform, selecting min(n₁, m₁) random samples that belong to phenotype 1 and min(n₂, m₂) random samples that belong to phenotype 2. Find the Pearson correlation coefficient between the two platforms for each of the p genes. These correlations are calculated with actual phenotype labels (ρ_(true)).
-   3. Sample without replacement from each platform, selecting min(n₁, m₁) and min(n₂, m₂) random samples that belong to any phenotype. Find the Pearson correlation coefficient between the two platforms for each of the p genes. These correlations are calculated when samples from both phenotypes are randomly sampled (ρ_(random)).
-   4. Repeat steps 2 and 3 a large number of times (we use 10⁴ times) and record the p (number of genes) correlation values in each step to estimate the distributions of ρ_(true) and ρ_(random) (see FIG. 13). Calculate the pooled standard deviation for each gene from the two estimated distributions of ρ_(true) and ρ_(random) and use the maximum value max(SD(ρ_(true)∪ρ_(random))) for step 5.
-   5. Use the non-parametric Wilcoxon's test of means to test the one-sided hypothesis H₀: ρ_(true)≤ρ_(random)+max(SD(ρ_(true)∪ρ_(random))) against the alternative H₁: ρ_(true)>ρ_(random)+max(SD(ρ_(true)∪ρ_(random))). This test rejects the null hypothesis for genes that are consistently over-expressed in one phenotype under both platforms, especially when the within-phenotype variability is negligible compared to the fold change (see FIG. 5). The term max(SD(ρ_(true)∪ρ_(random))) can optionally be multiplied by a constant to increase or decrease the number of genes that reject the null hypothesis.
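A minimal single-gene sketch of steps 2 to 5 is given below for illustration. Several choices are assumptions rather than part of the described method: the Wilcoxon test is implemented as a rank-sum test with a normal approximation and no tie correction, the shift is passed in by the caller (in the algorithm above it is the maximum pooled standard deviation over all genes), and all function names and example values are hypothetical.

```python
import math
import random
from statistics import pstdev

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def sample_correlations(x1, x2, y1, y2, n_iter=1000, seed=0):
    """Steps 2-4 for one gene: x1/x2 (y1/y2) hold platform X (Y) values for phenotype 1/2."""
    rng = random.Random(seed)
    k1, k2 = min(len(x1), len(y1)), min(len(x2), len(y2))
    rho_true, rho_random = [], []
    for _ in range(n_iter):
        # Step 2: sample within each phenotype, keeping the labels aligned.
        xs = rng.sample(x1, k1) + rng.sample(x2, k2)
        ys = rng.sample(y1, k1) + rng.sample(y2, k2)
        rho_true.append(pearson(xs, ys))
        # Step 3: sample the same number of samples ignoring phenotype labels.
        rho_random.append(pearson(rng.sample(x1 + x2, k1 + k2),
                                  rng.sample(y1 + y2, k1 + k2)))
    return rho_true, rho_random

def pooled_sd(rho_true, rho_random):
    """Standard deviation of the union of both correlation samples."""
    return pstdev(rho_true + rho_random)

def one_sided_rank_sum_p(a, b):
    """P-value for H1 'a tends to exceed b' (normal approximation, assumed variant)."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a, i = 0.0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j for tied values
        rank_sum_a += avg_rank * sum(1 for t in range(i, j) if pooled[t][1] == 0)
        i = j
    n_a, n_b = len(a), len(b)
    u = rank_sum_a - n_a * (n_a + 1) / 2.0
    z = (u - n_a * n_b / 2.0) / math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def is_concordant(rho_true, rho_random, shift, alpha=0.05):
    """Step 5: reject H0 when rho_true exceeds rho_random + shift."""
    return one_sided_rank_sum_p(rho_true, [r + shift for r in rho_random]) < alpha
```

A gene with a large, consistent fold change on both platforms yields ρ_(true) values near 1 and widely scattered ρ_(random) values, so the shifted test rejects the null hypothesis; a gene with no phenotype difference yields similar distributions and is not selected.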

Building the Classifier.

The shrunken centroid classifier (SCC) works as follows. First, it shrinks the gene centroids of each phenotype toward the overall centroids and standardizes by the within-phenotype standard deviation of each gene, giving higher weights to genes with stable within-phenotype expression. The centroids of each phenotype deviate from the overall centroids and the deviation is quantified by the absolute standardized deviation. The absolute standardized deviation is compared to a shrinkage threshold, and any value smaller than the threshold leads to discarding the corresponding gene from the classification process.
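The thresholding step can be illustrated with the following sketch. It assumes the usual soft-thresholding form of the nearest shrunken centroid method and takes the standardized centroid deviations as given; the function names, gene labels and values are hypothetical, and the sketch is not the pamr implementation.

```python
import math

def soft_threshold(d, delta):
    """Shrink a standardized deviation d toward zero by delta; small values become 0."""
    magnitude = abs(d) - delta
    return math.copysign(magnitude, d) if magnitude > 0 else 0.0

def surviving_genes(deviations, delta):
    """Keep genes whose shrunken deviation is non-zero for at least one class.

    deviations maps gene -> list of standardized centroid deviations, one per class.
    """
    return [gene for gene, ds in deviations.items()
            if any(soft_threshold(d, delta) != 0.0 for d in ds)]

# Made-up standardized deviations for two genes and two classes (HP, SSA/P).
deviations = {"gene_A": [2.5, -2.5], "gene_B": [0.3, -0.3]}
print(surviving_genes(deviations, delta=1.0))
```

Raising the threshold discards more genes, which is how the cross-validated choice of threshold controls the size of the signature.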

To select the threshold for the centroid shrinkage, a 3-fold cross-validation over a range of 30 threshold values was performed for 100 iterations (R package pamr version 1.55). The threshold returning the minimum mean error with the least number of genes was selected. Within every iteration, each gene's ability to separate HP from SSA/P samples was assessed by calculating the area under the ROC curve (R package ROCR version 1.0-7), and only genes with AUC>0.8 were left in the signature. The signature was employed with the SCC to classify independent validation samples as either HPs or SSA/Ps. For a p-dimensional validation sample X*, the classifier calculates a discriminant score δ_(k)(X*) for class k and assigns the class with min_(k)(δ_(k)(X*)) as the classification decision. Discriminant scores are used to estimate class probabilities (posterior probabilities) as a measure of the certainty of the classification decision:

$p_{k}(X^{*}) = \frac{e^{-\frac{1}{2}\delta_{k}(X^{*})}}{\sum_{m=1}^{M} e^{-\frac{1}{2}\delta_{m}(X^{*})}}$

where M is the number of classes.
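As a worked illustration of the formula above, the posterior probabilities can be computed from a set of discriminant scores as in the sketch below. The scores are made up and the function names are hypothetical; subtracting the minimum score before exponentiating is a numerical-stability choice that leaves the probabilities unchanged.

```python
import math

def class_probabilities(scores):
    """Posterior probabilities from discriminant scores (class -> delta_k)."""
    smallest = min(scores.values())
    weights = {k: math.exp(-0.5 * (s - smallest)) for k, s in scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def predicted_class(scores):
    """The class with the smallest discriminant score is the decision."""
    return min(scores, key=scores.get)

# Made-up discriminant scores for one validation sample.
scores = {"HP": 4.0, "SSA/P": 2.0}
print(predicted_class(scores), class_probabilities(scores))
```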

Classification of Independent FFPE Samples.

Expression levels of 13 genes were estimated relative to a reference level of a housekeeping gene, such that larger values represent lower expression levels and smaller values represent higher expression levels (see FIG. 17). Some samples were positively or negatively biased relative to each other (see FIG. 18A). Therefore, raw expression levels were normalized using two steps. First, raw expression levels were shifted by their respective sample means or medians to remove any possible positive or negative biases between samples and to center expression levels around zero. This step is crucial to reduce technical variation between samples. Three options that keep gene ranks in each sample unchanged (arithmetic mean, geometric mean, and median) were tried and no significant difference in the classification results was noticed (see Table 12). It was also found that quantile normalization, which forces all samples to have similar quantiles, yielded lower performance (data not shown). Although subtracting the arithmetic or geometric mean showed minor improvement in Table 12, subtracting the median is recommended when outliers are present in some samples. Expression levels are then multiplied by −1 to let higher expression levels be represented by larger values. Second, the gene-wise MAD normalization was applied such that genes with large fold changes between HP and SSA/P are likely to have positive values under one phenotype and negative values under the other. The normalized expression levels are shown in FIG. 18B. The summary metric (SM) is used to score each sample and each sample is then labeled as HP if SM<0 and as SSA/P if SM>0.
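The two normalization steps can be sketched as follows, for illustration only. The sketch uses the median for the sample-wise shift and, for the gene-wise step, centers each gene on its median across samples; any further scaling by the median absolute deviation is omitted here and would be an additional assumption. The nested-dictionary layout, function name, and raw values are hypothetical.

```python
from statistics import median

def normalize_qpcr(raw):
    """Two-step normalization of raw qPCR levels.

    raw maps sample -> {gene -> raw level}, where larger raw values mean
    lower expression. Returns the same layout with normalized values.
    """
    # Step 1: subtract each sample's median, then flip the sign so that
    # larger values represent higher expression.
    flipped = {}
    for sample, genes in raw.items():
        sample_median = median(genes.values())
        flipped[sample] = {g: -(v - sample_median) for g, v in genes.items()}
    # Step 2: center each gene on its median across all samples.
    gene_names = list(next(iter(raw.values())).keys())
    normalized = {sample: {} for sample in raw}
    for gene in gene_names:
        gene_median = median(flipped[s][gene] for s in raw)
        for sample in raw:
            normalized[sample][gene] = flipped[sample][gene] - gene_median
    return normalized

# Made-up raw levels for three samples and three genes.
raw = {
    "s1": {"A": 10, "B": 12, "C": 14},
    "s2": {"A": 13, "B": 13, "C": 15},
    "s3": {"A": 9, "B": 12, "C": 11},
}
print(normalize_qpcr(raw))
```

Because the gene-wise step relies on medians across samples, it needs a reference set of samples in addition to any new sample being classified.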

FIG. 14 and FIG. 15 show that the distribution of the MAD-normalized expression and the distribution of SM in one RNA-seq and two microarray data sets were comparable; hence the shrunken centroid classifier trained with RNA-seq data can be applied successfully to classify microarray samples. Accurate estimates of the summary metric distribution for each platform allowed proper standardization of the summary metric and hence a proper phenotype assignment probability using CLB. While this approach works for high throughput platforms that profile thousands of genes, it is not applicable under typical clinical settings when qPCR is used to profile only a few genes, because the distribution of SM is unknown. This is why phenotype assignment probabilities are not available when platforms that profile a few genes (such as small-scale qPCR) are used.
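For platforms where the SM distribution can be estimated, the phenotype assignment probability based on the Cantelli lower bound (CLB) can be sketched as below. The sketch assumes a standardized SM with σ=1 by default and applies the rule that a probability is reported only when the bound exceeds 0.5; the function names and example values are hypothetical.

```python
def cantelli_lower_bound(a, sigma=1.0):
    """Cantelli lower bound 1 - sigma^2 / (sigma^2 + a^2) for a >= 0."""
    return 1.0 - sigma ** 2 / (sigma ** 2 + a ** 2)

def clb_call(sm, sigma=1.0):
    """Return (class, probability), or ("uncertain", None) when the bound is <= 0.5.

    sm is the standardized summary metric when sigma = 1.
    """
    bound = cantelli_lower_bound(abs(sm), sigma)
    if bound <= 0.5:
        return ("uncertain", None)
    return ("SSA/P" if sm > 0 else "HP", bound)

# Made-up standardized SM values.
for sm in (2.0, -3.0, 0.5):
    print(sm, clb_call(sm))
```

The bound is distribution-free, so it is more conservative than the empirical or normal-approximation probabilities, and every sample with |SM| ≤ σ falls into the uncertain zone.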

To classify new qPCR samples using our simple approach, the two normalization steps above must be applied. R code implementing the two normalization steps and classifying samples using the summary metric of 13 genes is provided in the R code below. To apply MAD normalization to real-time qPCR expression levels, multiple samples are necessary to estimate the median expression level for each gene accurately. Therefore, the raw qPCR expression levels for the FFPE data set (24 HPs and 21 SSA/Ps) are provided in Table S10 to allow the normalization of any new qPCR samples. The first normalization step resolves any potential shift biases between the new samples and the samples in Table 13.

Software Availability.

The nearest shrunken centroid classifier implementation in R is available in the CRAN package pamr. R code and instructions on how to apply the simple 13-gene signature to classify new qPCR samples as either HP or SSA/P are provided below.

R Code and Instructions.

# save a copy of Supplementary Table S10 in your working directory
setwd("working directory here")

# choose "mean", "geometricMean", or "median" for sample normalization
sample.nor <- "median"

# read Table 13
FFPEtab <- read.csv("Table_13.csv")
class.labels <- as.character(FFPEtab[,2])
FFPEmat <- as.matrix(FFPEtab[,3:15])
rownames(FFPEmat) <- as.character(FFPEtab[,1])
colnames(FFPEmat) <- colnames(FFPEtab)[3:15]
FFPEmat <- t(FFPEmat)

# read your new samples from a comma-delimited file
# gene names must be in the first column
# expression levels should occupy one or more of the following columns
# and sample names can be used as column headers
new.samples <- read.csv("new_samples.csv")
# drop the gene-name column so that the matrix stays numeric
new.mat <- as.matrix(new.samples[,-1, drop=FALSE])
rownames(new.mat) <- as.character(new.samples[,1])
new.mat <- new.mat[rownames(FFPEmat), , drop=FALSE]

# append new samples to Table 13
FFPEmat <- cbind(FFPEmat, new.mat)

# subtract the mean/median from each sample
if(sample.nor == "median") mm <- apply(FFPEmat, 2, "median")
if(sample.nor == "mean") mm <- apply(FFPEmat, 2, "mean")
if(sample.nor == "geometricMean") mm <- apply(FFPEmat, 2, function(x){prod(x)^(1/length(x))})
# one row per gene, one column per sample (including the appended samples)
mat <- matrix(mm, nrow(FFPEmat), ncol(FFPEmat), byrow=TRUE)
FFPEmat <- FFPEmat - mat

# center each gene's expression around zero
# multiply by -1 to let higher values represent higher expression levels
FFPEmat.nor <- -sweep(FFPEmat, 1, apply(FFPEmat, 1, "median"))

# calculate the summary metric (SM)
# expression of genes "CHFR", "CHGA", and "NTRK2" is multiplied by -1
sig <- c("CHFR","CHGA","CLDN1","KIZ","MEGF6","NTRK2","PLA2G16","PTAFR","SBSPON","SEMG1","SLC7A9","SPIRE1","TACSTD2")
signature.size <- length(sig)
mask <- matrix(1, signature.size, ncol(FFPEmat.nor), byrow=FALSE)
mask[c(1,2,6),] <- -1
SM <- colMeans(FFPEmat.nor[sig,]*mask)

# if SM>0 then sample is classified as SSA/P
# else if SM<0 then sample is classified as HP

REFERENCES

-   1. Zauber, A. G., S. J. Winawer, M. J. O'Brien, I. Lansdorp-Vogelaar, M. van Ballegooijen, B. F. Hankey, W. Shi, J. H. Bond, M. Schapiro, J. F. Panish, E. T. Stewart, and J. D. Waye, Colonoscopic polypectomy and long-term prevention of colorectal-cancer deaths. N Engl J Med, 2012. 366(8): p. 687-96.
-   2. Lieberman, D. A., D. G. Weiss, J. H. Bond, D. J. Ahnen, H. Garewal, and G. Chejfec, Use of colonoscopy to screen asymptomatic adults for colorectal cancer. Veterans Affairs Cooperative Study Group 380. N Engl J Med, 2000. 343(3): p. 162-8.
-   3. Levin, B., D. A. Lieberman, B. McFarland, R. A. Smith, D. Brooks, K. S. Andrews, C. Dash, F. M. Giardiello, S. Glick, T. R. Levin, P. Pickhardt, D. K. Rex, A. Thorson, S. J. Winawer, G. American Cancer Society Colorectal Cancer Advisory, U. S. M.-S. T. Force, and C. American College of Radiology Colon Cancer, Screening and surveillance for the early detection of colorectal cancer and adenomatous polyps, 2008: a joint guideline from the American Cancer Society, the US Multi-Society Task Force on Colorectal Cancer, and the American College of Radiology. CA Cancer J Clin, 2008. 58(3): p. 130-60.
-   4. Quintero, E., A. Castells, L. Bujanda, J. Cubiella, D. Salas, A. Lanas, M. Andreu, F. Carballo, J. D. Morillas, C. Hernandez, R. Jover, I. Montalvo, J. Arenas, E. Laredo, V. Hernandez, F. Iglesias, E. Cid, R. Zubizarreta, T. Sala, M. Ponce, M. Andres, G. Teruel, A. Perls, M. P. Roncales, M. Polo-Tomas, X. Bessa, O. Ferrer-Armengou, J. Grau, A. Serradesanferm, A. Ono, J. Cruzado, F. Perez-Riquelme, I. Alonso-Abreu, M. de la Vega-Prieto, J. M. Reyes-Melian, G. Cacho, J. Diaz-Tasende, A. Herreros-de-Tejada, C. Poves, C. Santander, A. Gonzalez-Navarro, and C. S. Investigators, Colonoscopy versus fecal immunochemical testing in colorectal-cancer screening. N Engl J Med, 2012. 366(8): p. 697-706.
-   5. Limketkai, B. N., D. Lam-Himlin, M. A. Arnold, and C. A. Arnold, The cutting edge of serrated polyps: a practical guide to approaching and managing serrated colon polyps. Gastrointest Endosc, 2013. 77(3): p. 360-75.
-   6. Kahi, C. J., D. G. Hewett, D. L. Norton, G. J. Eckert, and D. K. Rex, Prevalence and variable detection of proximal colon serrated polyps during screening colonoscopy. Clin Gastroenterol Hepatol, 2011. 9(1): p. 42-6.
-   7. Torlakovic, E. and D. C. Snover, Serrated adenomatous polyposis in humans. Gastroenterology, 1996. 110(3): p. 748-55.
-   8. Kahi, C. J., X. Li, G. J. Eckert, and D. K. Rex, High colonoscopic prevalence of proximal colon serrated polyps in average-risk men and women. Gastrointest Endosc, 2012. 75(3): p. 515-20.
-   9. Torlakovic, E., E. Skovlund, D. C. Snover, G. Torlakovic, and J. M. Nesland, Morphologic reappraisal of serrated colorectal polyps. Am J Surg Pathol, 2003. 27(1): p. 65-81.
-   10. Torlakovic, E. E., J. D. Gomez, D. K. Driman, J. R. Parfitt, C. Wang, T. Benerjee, and D. C. Snover, Sessile serrated adenoma (SSA) vs. traditional serrated adenoma (TSA). Am J Surg Pathol, 2008. 32(1): p. 21-9.
-   11. Lash, R. H., R. M. Genta, and C. M. Schuler, Sessile serrated adenomas: prevalence of dysplasia and carcinoma in 2139 patients. J Clin Pathol, 2010. 63(8): p. 681-6.
-   12. Rex, D. K., D. J. Ahnen, J. A. Baron, K. P. Batts, C. A. Burke, R. W. Burt, J. R. Goldblum, J. G. Guillem, C. J. Kahi, M. F. Kalady, M. J. O'Brien, R. D. Odze, S. Ogino, S. Parry, D. C. Snover, E. E. Torlakovic, P. E. Wise, J. Young, and J. Church, Serrated lesions of the colorectum: review and recommendations from an expert panel. Am J Gastroenterol, 2012. 107(9): p. 1315-29; quiz 1314, 1330.
-   13. Payne, S. R., T. R. Church, M. Wandell, T. Rosch, N. Osborn, D. Snover, R. W. Day, D. F. Ransohoff, and D. K. Rex, Endoscopic detection of proximal serrated lesions and pathologic identification of sessile serrated adenomas/polyps vary on the basis of center. Clin Gastroenterol Hepatol, 2014. 12(7): p. 1119-26.
-   14. Tinmouth, J., P. Henry, E. Hsieh, N. N. Baxter, R. J. Hilsden, S. Elizabeth McGregor, L. F. Paszat, A. Ruco, R. Saskin, A. J. Schell, E. E. Torlakovic, and L. Rabeneck, Sessile serrated polyps at screening colonoscopy: have they been under diagnosed? Am J Gastroenterol, 2014. 109(11): p. 1698-704.
-   15. Bettington, M., N. Walker, A. Clouston, I. Brown, B. Leggett, and V. Whitehall, The serrated pathway to colorectal carcinoma: current concepts and challenges. Histopathology, 2013. 62(3): p. 367-86.
-   16. De Sousa, E. M. F., X. Wang, M. Jansen, E. Fessler, A. Trinh, L. P. de Rooij, J. H. de Jong, O. J. de Boer, R. van Leersum, M. F. Bijlsma, H. Rodermond, M. van der Heijden, C. J. van Noesel, J. B. Tuynman, E. Dekker, F. Markowetz, J. P. Medema, and L. Vermeulen, Poor-prognosis colon cancer is defined by a molecularly distinct subtype and develops from serrated precursor lesions. Nat Med, 2013. 19(5): p. 614-8.
-   17. Castaldi, P. J., I. J. Dahabreh, and J. P. Ioannidis, An empirical assessment of validation practices for molecular classifiers. Brief Bioinform, 2011. 12(3): p. 189-202.
-   18. Chang, C. Q., S. R. Tingle, K. K. Filipski, M. J. Khoury, T. K. Lam, S. D. Schully, and J. P. Ioannidis, An overview of recommendations and translational milestones for genomic tests in cancer. Genet Med, 2014.
-   19. Chibon, F., Cancer gene expression signatures—the rise and fall? Eur J Cancer, 2013. 49(8): p. 2000-9.
-   20. Shi, W., M. Bessarabova, D. Dosymbekov, Z. Derso, T. Nikolskaya, M. Dudoladova, T. Serebryiskaya, A. Bugrim, A. Guryanov, R. J. Brennan, R. Shah, J. Dopazo, M. Chen, Y. Deng, T. Shi, G. Jurman, C. Furlanello, R. S. Thomas, J. C. Corton, W. Tong, L. Shi, and Y. Nikolsky, Functional analysis of multiple genomic signatures demonstrates that classification algorithms choose phenotype-related genes. Pharmacogenomics J, 2010. 10(4): p. 310-23.
-   21. Su, Z., H. Fang, H. Hong, L. Shi, W. Zhang, W. Zhang, Y. Zhang, Z. Dong, L. J. Lancashire, M. Bessarabova, X. Yang, B. Ning, B. Gong, J. Meehan, J. Xu, W. Ge, R. Perkins, M. Fischer, and W. Tong, An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era. Genome Biol, 2014. 15(12): p. 523.
-   22. Tarca, A. L., M. Lauria, M. Unger, E. Bilal, S. Boue, K. Kumar Dey, J. Hoeng, H. Koeppl, F. Martin, P. Meyer, P. Nandy, R. Norel, M. Peitsch, J. J. Rice, R. Romero, G. Stolovitzky, M. Talikka, Y. Xiang, C. Zechner, and I. D. Collaborators, Strengths and limitations of microarray-based phenotype prediction: lessons learned from the IMPROVER Diagnostic Signature Challenge. Bioinformatics, 2013. 29(22): p. 2892-9.
-   23. Alizadeh, A. A., M. B. Eisen, R. E. Davis, C. Ma, I. S. Lossos, A. Rosenwald, J. C. Boldrick, H. Sabet, T. Tran, X. Yu, J. I. Powell, L. Yang, G. E. Marti, T. Moore, J. Hudson, Jr., L. Lu, D. B. Lewis, R. Tibshirani, G. Sherlock, W. C. Chan, T. C. Greiner, D. D. Weisenburger, J. O. Armitage, R. Warnke, R. Levy, W. Wilson, M. R. Greyer, J. C. Byrd, D. Botstein, P. O. Brown, and L. M. Staudt, Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature, 2000. 403(6769): p. 503-11.
-   24. Dave, S. S., G. Wright, B. Tan, A. Rosenwald, R. D. Gascoyne, W. C. Chan, R. I. Fisher, R. M. Braziel, L. M. Rimsza, T. M. Grogan, T. P. Miller, M. LeBlanc, T. C. Greiner, D. D. Weisenburger, J. C. Lynch, J. Vose, J. O. Armitage, E. B. Smeland, S. Kvaloy, H. Nolte, J. Delabie, J. M. Connors, P. M. Lansdorp, Q. Ouyang, T. A. Lister, A. J. Davies, A. J. Norton, H. K. Muller-Hermelink, G. Ott, E. Campo, E. Montserrat, W. H. Wilson, E. S. Jaffe, R. Simon, L. Yang, J. Powell, H. Zhao, N. Goldschmidt, M. Chiorazzi, and L. M. Staudt, Prediction of survival in follicular lymphoma based on molecular features of tumor-infiltrating immune cells. N Engl J Med, 2004. 351(21): p. 2159-69.
-   25. Lascorz, J., B. Chen, K. Hemminki, and A. Forsti, Consensus pathways implicated in prognosis of colorectal cancer identified through systematic enrichment analysis of gene expression profiling studies. PLoS One, 2011. 6(4): p. e18867.
-   26. Salazar, R., P. Roepman, G. Capella, V. Moreno, I. Simon, C. Dreezen, A. Lopez-Doriga, C. Santos, C. Marijnen, J. Westerga, S. Bruin, D. Kerr, P. Kuppen, C. van de Velde, H. Morreau, L. Van Velthuysen, A. M. Glas, L. J. Van't Veer, and R. Tollenaar, Gene expression signature to improve prognosis prediction of stage II and III colorectal cancer. J Clin Oncol, 2011. 29(1): p. 17-24.
-   27. Gray, R. G., P. Quirke, K. Handley, M. Lopatin, L. Magill, F. L. Baehner, C. Beaumont, K. M. Clark-Langone, C. N. Yoshizawa, M. Lee, D. Watson, S. Shak, and D. J. Kerr, Validation study of a quantitative multigene reverse transcriptase-polymerase chain reaction assay for assessment of recurrence risk in patients with stage II colon cancer. J Clin Oncol, 2011. 29(35): p. 4611-9.
-   28. Caruso, M., J. Moore, G. J. Goodall, M. Thomas, S. Phillis, A. Tyskin, G. Cheetham, N. Lerda, H. Takahashi, and A. Ruszkiewicz, Over-expression of cathepsin E and trefoil factor 1 in sessile serrated adenomas of the colorectum identified by gene expression analysis. Virchows Arch, 2009. 454(3): p. 291-302.
-   29. Gonzalo, D. H., K. K. Lai, B. Shadrach, J. R. Goldblum, A. E. Bennett, E. Downs-Kelly, X. Liu, W. Henricks, D. T. Patil, P. Carver, J. Na, B. Gopalan, L. Rybicki, and R. K. Pai, Gene expression profiling of serrated polyps identifies annexin A10 as a marker of a sessile serrated adenoma/polyp. J Pathol, 2013.
230(4): p. 420-9.-   30. Delker, D. A., B. M. McGettigan, P. Kanth, S. Pop, D. W.    Neklason, M. P. Bronner, R. W. Burt, and C. H. Hagedorn, RNA    sequencing of sessile serrated colon polyps identifies    differentially expressed genes and immunohistochemical markers. PLoS    One, 2014. 9(2): p. e88367.-   31. Glebov, O. K., L. M. Rodriguez, K. Nakahara, J. Jenkins, J.    Cliatt, C. J. Humbyrd, J. DeNobile, P. Soballe, R. Simon, G.    Wright, P. Lynch, S. Patterson, H. Lynch, S. Gallinger, A.    Buchbinder, G. Gordon, E. Hawk, and I. R. Kirsch, Distinguishing    right from left colon by the pattern of gene expression. Cancer    Epidemiol Biomarkers Prev, 2003. 12(8): p. 755-62.-   32. Hanahan, D. and R. A. Weinberg, Hallmarks of cancer: the next    generation. Cell, 2011. 144(5): p. 646-74.-   33. Galamb, O., F. Sipos, N. Solymosi, S. Spisak, T. Krenacs, K.    Toth, Z. Tulassay, and B. Molnar, Diagnostic mRNA expression    patterns of inflamed, benign, and malignant colorectal biopsy    specimen and their correlation with peripheral blood results. Cancer    Epidemiol Biomarkers Prev, 2008. 17(10): p. 2835-45.-   34. Saeys, Y., I. Inza, and P. Larranaga, A review of feature    selection techniques in bioinformatics. Bioinformatics, 2007.    23(19): p. 2507-17.-   35. Tibshirani, R., T. Hastie, B. Narasimhan, and G. Chu, Diagnosis    of multiple cancer types by shrunken centroids of gene expression.    Proc Natl Acad Sci USA, 2002. 99(10): p. 6567-72.-   36. Li, H. J., S. K. Ray, N. K. Singh, B. Johnston, and A. B.    Leiter, Basic helix-loop—helix transcription factors and    enteroendocrine cell differentiation. Diabetes Obes Metab, 2011. 13    Suppl 1: p. 5-12.-   37. Scolnick, D. M. and T. D. Halazonetis, Chfr defines a mitotic    stress checkpoint that delays entry into metaphase. Nature, 2000.    406(6794): p. 430-5.-   38. Yu, X., K. Minter-Dykhouse, L. Malureanu, W. M. Zhao, D.    Zhang, C. J. Merkle, I. M. Ward, H. Saya, G. Fang, J. van Deursen,    and J. 
Chen, Chfr is required for tumor suppression and Aurora A    regulation. Nat Genet, 2005. 37(4): p. 401-6.-   39. Cleven, A. H., S. Derks, M. X. Draht, K. M. Smits, V.    Melotte, L. Van Neste, B. Tournier, V. Jooste, C. Chapusot, M. P.    Weijenberg, J. G. Herman, A. P. de Bruine, and M. van Engeland, CHFR    promoter methylation indicates poor prognosis in stage II    microsatellite stable colorectal cancer. Clin Cancer Res, 2014.    20(12): p. 3261-71.-   40. Yamanami, H., K. Shiozaki, T. Wada, K. Yamaguchi, T. Uemura, Y.    Kakugawa, T. Hujiya, and T. Miyagi, Down-regulation of sialidase    NEU4 may contribute to invasive properties of human colon cancers.    Cancer Sci, 2007. 98(3): p. 299-307.-   41. Samarajiwa, S. A., S. Forster, K. Auchettl, and P. J. Hertzog,    INTERFEROME: the database of interferon regulated genes. Nucleic    Acids Res, 2009. 37(Database issue): p. D852-7.-   42. de Veer, M. J., M. Holko, M. Frevel, E. Walker, S. Der, J. M.    Paranjape, R. H. Silverman, and B. R. Williams, Functional    classification of interferon-stimulated genes identified using    microarrays. J Leukoc Biol, 2001. 69(6): p. 912-20.-   43. Carrega, P., S. Campana, I. Bonaccorsi, and G. Ferlazzo, The Yin    and Yang of Innate Lymphoid Cells in Cancer. Immunol Lett, 2016.-   44. Wang, G., X. Yang, C. Li, X. Cao, X. Luo, and J. Hu, PIK3R3    induces epithelial-to-mesenchymal transition and promotes metastasis    in colorectal cancer. Mol Cancer Ther, 2014. 13(7): p. 1837-47.-   45. Zhang, J. X., X. X. Huang, M. B. Cai, Z. T. Tong, J. W. Chen, D.    Qian, Y. J. Liao, H. X. Deng, D. Z. Liao, M. Y. Huang, Y. X.    Zeng, D. Xie, and S. J. Mai, Overexpression of the secretory small    GTPase Rab27B in human breast cancer correlates closely with lymph    node metastasis and predicts poor prognosis. J Transl Med, 2012.    10: p. 242.-   46. Hamada, S., K. Satoh, A. Masamune, and T. Shimosegawa,    Regulators of epithelial mesenchymal transition in pancreatic    cancer. 
Front Physiol, 2012. 3: p. 254.-   47. Ball, H. J., H. J. Yuasa, C. J. Austin, S. Weiser, and N. H.    Hunt, Indoleamine-   2,3-dioxygenase-2; a new enzyme in the kynurenine pathway. Int J    Biochem Cell Biol, 2009. 41(3): p. 467-71.-   48. Fallarino, F., U. Grohmann, C. Vacca, C. Orabona, A.    Spreca, M. C. Fioretti, and P. Puccetti, T cell apoptosis by    kynurenines. Adv Exp Med Biol, 2003. 527: p. 183-90.-   49. Uyttenhove, C., L. Pilotte, I. Theate, V. Stroobant, D.    Colau, N. Parmentier, T. Boon, and B. J. Van den Eynde, Evidence for    a tumoral immune resistance mechanism based on tryptophan    degradation by indoleamine 2,3-dioxygenase. Nat Med, 2003. 9(10): p.    1269-74.-   50. Opitz, C. A., U. M. Litzenburger, U. Opitz, F. Sahm, K. Ochs, C.    Lutz, W. Wick, and M. Platten, The indoleamine-2,3-dioxygenase (IDO)    inhibitor 1-methyl-D-tryptophan upregulates IDO1 in human cancer    cells. PLoS One, 2011. 6(5): p. e19823.-   51. Iansante, V., P. M. Choy, S. W. Fung, Y. Liu, J. G. Chai, J.    Dyson, A. Del Rio, C. D'Santos, R. Williams, S. Chokshi, R. A.    Anders, C. Bubici, and S. Papa, PARP14 promotes the Warburg effect    in hepatocellular carcinoma by inhibiting JNK1-dependent PKM2    phosphorylation and activation. Nat Commun, 2015. 6: p. 7882.-   52. Bao, J., Y. Ni, H. Qin, L. Xu, Z. Ge, F. Zhan, H. Zhu, J.    Zhao, X. Zhou, X. Tang, and L. Tang, Rab27b is a potential predictor    for metastasis and prognosis in colorectal cancer. Gastroenterol Res    Pract, 2014. 2014: p. 913106.-   53. Hendrix, A., D. Maynard, P. Pauwels, G. Braems, H. Denys, R. Van    den Broecke, J. Lambert, S. Van Belle, V. Cocquyt, C. Gespach, M.    Bracke, M. C. Seabra, W. A. Gahl, O. De Wever, and W. Westbroek,    Effect of the secretory small GTPase Rab27B on breast cancer growth,    invasion, and metastasis. J Natl Cancer Inst, 2010. 102(12): p.    866-80.-   54. Li, J., M. S. Weinberg, L. Zerbini, and S. 
Prince, The oncogenic    TBX3 is a downstream target and mediator of the TGF-beta 1 signaling    pathway. Mol Biol Cell, 2013. 24(22): p. 3569-76.-   55. Shan, Z. Z., X. B. Yan, L. L. Yan, Y. Tian, Q. C. Meng, W. W.    Qiu, Z. Zhang, and Z. M. Jin, Overexpression of Tbx3 is correlated    with Epithelial-Mesenchymal Transition phenotype and predicts poor    prognosis of colorectal cancer. Am J Cancer Res, 2015. 5(1): p.    344-53.-   56. Baron, K. D., K. Al-Zahrani, J. Conway, C. Labreche, C. J.    Storbeck, J. E. Visvader, and L. A. Sabourin, Recruitment and    activation of SLK at the leading edge of migrating cells requires    Src family kinase activity and the LIM-only protein 4. Biochim    Biophys Acta, 2015. 1853(7): p. 1683-92.-   57. Byrne, J. A., S. Frost, Y. Chen, and R. K. Bright, Tumor protein    D52 (TPD52) and cancer-oncogene understudy or understudied oncogene?    Tumour Biol, 2014. 35(8): p. 7369-82.-   58. Owens, S. R., S. I. Chiosea, and S. F. Kuan, Selective    expression of gastric mucin MUC6 in colonic sessile serrated adenoma    but not in hyperplastic polyp aids in morphological diagnosis of    serrated polyps. Mod Pathol, 2008. 21(6): p. 660-9.-   59. Bartley, A. N., P. A. Thompson, J. A. Buckmeier, C. Y.    Kepler, C. H. Hsu, M. S. Snyder, P. Lance, A. Bhattacharyya,    and S. R. Hamilton, Expression of gastric pyloric mucin, MUC6, in    colorectal serrated polyps. Mod Pathol, 2010. 23(2): p. 169-76.-   60. Gibson, J. A., H. P. Hahn, A. Shahsafaei, and R. D. Odze, MUC    expression in hyperplastic and serrated colonic polyps: lack of    specificity of MUC6. Am J Surg Pathol, 2011. 35(5): p. 742-9.-   61. Oshimori, N., M. Ohsugi, and T. Yamamoto, The Plk1 target Kizuna    stabilizes mitotic centrosomes to ensure spindle bipolarity. Nat    Cell Biol, 2006. 8(10): p. 1095-101.-   62. Lagal, V., M. Abrivard, V. Gonzalez, A. Perazzi, S. Popli, E.    Verzeroli, and I. 
Tardieux, Spire-1 contributes to the invadosome    and its associated invasive properties. J Cell Sci, 2014. 127(Pt    2): p. 328-40.-   63. Ashburner, M., C. A. Ball, J. A. Blake, D. Botstein, H.    Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T.    Eppig, M. A. Harris, D. P. Hill, L. Issel-Tarver, A. Kasarskis, S.    Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, G. M. Rubin,    and G. Sherlock, Gene ontology: tool for the unification of biology.    The Gene Ontology Consortium. Nat Genet, 2000. 25(1): p. 25-9.-   64. Liberzon, A., A. Subramanian, R. Pinchback, H.    Thorvaldsdottir, P. Tamayo, and J. P. Mesirov, Molecular signatures    database (MSigDB) 3.0. Bioinformatics, 2011. 27(12): p. 1739-40.-   65. Wu, D., E. Lim, F. Vaillant, M. L. Asselin-Labat, J. E.    Visvader, and G. K. Smyth, ROAST: rotation gene set tests for    complex microarray experiments. Bioinformatics, 2010. 26(17): p.    2176-82.-   66. Weiss, A. and L. Attisano, The TGFbeta superfamily signaling    pathway. Wiley Interdiscip Rev Dev Biol, 2013. 2(1): p. 47-63.-   67. Rahmatallah, Y., F. Emmert-Streib, and G. Glazko, Gene Sets Net    Correlations Analysis (GSNCA): a multivariate differential    coexpression test for gene sets. Bioinformatics, 2014. 30(3): p.    360-8.-   68. McVey, M. and S. E. Lee, MMEJ repair of double-strand breaks    (director's cut): deleted sequences and alternative endings. Trends    Genet, 2008. 24(11): p. 529-38.-   69. Ishigooka, S., M. Nomoto, N. Obinata, Y. Oishi, Y. Sato, S.    Nakatsu, M. Suzuki, Y. Ikeda, T. Maehata, T. Kimura, Y. Watanabe, T.    Nakajima, H. O. Yamano, H. Yasuda, and F. Itoh, Evaluation of    magnifying colonoscopy in the diagnosis of serrated polyps. World J    Gastroenterol, 2012. 18(32): p. 4308-16.-   70. Manning, S. and R. L. Batterham, The role of gut hormone peptide    YY in energy and glucose homeostasis: twelve years on. Annu Rev    Physiol, 2014. 76: p. 585-608.-   71. EI-Salhy, M., T. Mazzawi, D. 
Gundersen, J. G. Hatlebakk, and T.    Hausken, The role of peptide YY in gastrointestinal diseases and    disorders (review). Int J Mol Med, 2013. 31(2): p. 275-82.-   72. Newish, M., C. J. Lord, S. A. Martin, D. Cunningham, and A.    Ashworth, Mismatch repair deficient colorectal cancer in the era of    personalized treatment. Nat Rev Clin Oncol, 2010. 7(4): p. 197-208.-   73. Pavelitz, T., L. Renfro, N. R. Foster, A. Caracol, P.    Welsch, V. V. Lao, W. B. Grady, D. Niedzwiecki, L. B. Saltz, M. M.    Bertagnolli, R. M. Goldberg, P. S. Rabinovitch, M. Emond, R. J.    Monnat, Jr., and N. Maizels, MRE11-deficiency associated with    improved long-term disease free survival and overall survival in a    subset of stage III colon cancer patients in randomized CALGB 89803    trial. PLoS One, 2014. 9(10): p. e108483.-   74. Ambroise, C. and G. J. McLachlan, Selection bias in gene    extraction on the basis of microarray gene-expression data. Proc    Natl Acad Sci USA, 2002. 99(10): p. 6562-6.-   75. Simon, R., Roadmap for developing and validating therapeutically    relevant genomic classifiers. J Clin Oncol, 2005. 23(29): p.    7332-41.-   76. Trapnell, C., D. G. Hendrickson, M. Sauvageau, L. Goff, J. L.    Rinn, and L. Pachter, Differential analysis of gene regulation at    transcript resolution with RNA-seq. Nat Biotechnol, 2013. 31(1): p.    46-53.-   77. Fumagalli, D., A. Blanchet-Cohen, D. Brown, C. Desmedt, D.    Gacquer, S. Michiels, F. Rothe, S. Majjaj, R. Salgado, D.    Larsimont, M. Ignatiadis, M. Maetens, M. Piccart, V. Detours, C.    Sotiriou, and B. Haibe-Kains, Transfer of clinically relevant gene    expression signatures in breast cancer: from Affymetrix microarray    to Illumina RNA-Sequencing technology. BMC Genomics, 2014. 15: p.    1008.-   78. Marioni, J. C., C. E. Mason, S. M. Mane, M. Stephens, and Y.    Gilad, RNA-seq: an assessment of technical reproducibility and    comparison with gene expression arrays. Genome Res, 2008. 18(9): p.    
1509-17.-   79. Wang, C., B. Gong, P. R. Bushel, J. Thierry-Mieg, D.    Thierry-Mieg, J. Xu, H. Fang, H. Hong, J. Shen, Z. Su, J. Meehan, X.    Li, L. Yang, H. Li, P. P. Labaj, D. P. Kreil, D. Megherbi, S.    Gaj, F. Caiment, J. van Delft, J. Kleinjans, A. Scherer, V.    Devanarayan, J. Wang, Y. Yang, H. R. Qian, L. J. Lancashire, M.    Bessarabova, Y. Nikolsky, C. Furlanello, M. Chierici, D.    Albanese, G. Jurman, S. Riccadonna, M. Filosi, R. Visintainer, K. K.    Zhang, J. Li, J. H. Hsieh, D. L. Svoboda, J. C. Fuscoe, Y. Deng, L.    Shi, R. S. Paules, S. S. Auerbach, and W. Tong, The concordance    between RNA-seq and microarray data depends on chemical treatment    and transcript abundance. Nat Biotechnol, 2014. 32(9): p. 926-32.-   80. Zhao, P., H. Z. Yu, and J. H. Cai, Clinical investigation of    TROP-2 as an independent biomarker and potential therapeutic target    in colon cancer. Mol Med Rep, 2015. 12(3): p. 4364-9.-   81. Fang, Y. J., Z. H. Lu, G. Q. Wang, Z. Z. Pan, Z. W. Zhou, J. P.    Yun, M. F. Zhang, and D. S. Wan, Elevated expressions of MMP7,    TROP2, and survivin are associated with survival, disease    recurrence, and liver metastasis of colon cancer. Int J Colorectal    Dis, 2009. 24(8): p. 875-84.-   82. Starodub, A. N., A. J. Ocean, M. A. Shah, M. J. Guarino, V. J.    Picozzi, Jr., L. T. Vandat, S. S. Thomas, S. V. Govindan, P. P.    Maliakal, W. A. Wegener, S. A. Hamburger, R. M. Sharkey, and D. M.    Goldenberg, First-in-Human Trial of a Novel Anti-Trop-2    Antibody-SN-38 Conjugate, Sacituzumab Govitecan, for the Treatment    of Diverse Metastatic Solid Tumors. Clin Cancer Res, 2015.    21(17): p. 3870-8.-   83. Pope, J. L., R. Ahmad, A. A. Bhat, M. K. Washington, A. B.    Singh, and P. Dhawan, Claudin-1 overexpression in intestinal    epithelial cells enhances susceptibility to adenamatous polyposis    coli-mediated colon tumorigenesis. Mol Cancer, 2014. 13: p. 167.-   84. Kim, J. C., Y. J. Ha, K. H. Tak, S. A. Roh, C. W. Kim, T. W.    
Kim, S. K. Kim, S. Y. Kim, D. H. Cho, and Y. S. Kim, Complex    Behavior of ALDH1A1 and IGFBP1 in Liver Metastasis from a Colorectal    Cancer. PLoS One, 2016. 11(5): p. e0155160.-   85. Xiong, S., H. Tu, M. Kollareddy, V. Pant, Q. Li, Y. Zhang, J. G.    Jackson, Y. A. Suh, A. C. Elizondo-Fraire, P. Yang, G. Chau, M.    Tashakori, A. R. Wasylishen, Z. Ju, H. Solomon, V. Rotter, B.    Liu, A. K. El-Naggar, L. A. Donehower, L. A. Martinez, and G.    Lozano, Pla2g16 phospholipase mediates gain-of-function activities    of mutant p53. Proc Natl Acad Sci USA, 2014. 111(30): p. 11145-50.-   86. Chen, J., T. Lan, W. Zhang, L. Dong, N. Kang, S. Zhang, M.    Fu, B. Liu, K. Liu, and Q. Zhan, Feed-Forward Reciprocal Activation    of PAFR and STATS Regulates Epithelial-Mesenchymal Transition in    Non-Small Cell Lung Cancer. Cancer Res, 2015. 75(19): p. 4198-210.-   87. Walpole, R., R. Myers, and S. Myers, Probability and statistics    for engineers and scientists., 1998, Prentice Hall.-   88. Savage, R., Probability Inequalities of the Tchebycheff Type.    JOURNAL OF RESEARCH of the National Bureau of Standards—B.    Mathematics and Mathematical Physics 1961. 65B(3): p. 211-226.-   89. Higuchi, T. and J. R. Jass, My approach to serrated polyps of    the colorectum. J Clin Pathol, 2004. 57(7): p. 682-6.-   90. Beggs, A. D., A. Jones, N. Shepherd, A. Arnaout, C.    Finlayson, A. M. Abulafi, D. G. Morton, G. M. Matthews, S. V.    Hodgson, and I. P. Tomlinson, Loss of expression and promoter    methylation of SLIT2 are associated with sessile serrated adenoma    formation. PLoS Genet, 2013. 9(5): p. e1003488.-   91. Shon, W. J., Y. K. Lee, J. H. Shin, E. Y. Choi, and D. M. Shin,    Severity of DSS—induced colitis is reduced in Ido1-deficient mice    with down-regulation of TLR-MyD88-NF-kB transcriptional networks.    Sci Rep, 2015. 5: p. 17305.-   92. Prendergast, G. C., C. Smith, S. Thomas, L. Mandik-Nayak, L.    Laury-Kleintop, R. Metz, and A. J. 
Muller, Indoleamine    2,3-dioxygenase pathways of pathogenic inflammation and immune    escape in cancer. Cancer Immunol Immunother, 2014. 63(7): p. 721-35.-   93. Prendergast, G. C., R. Metz, and A. J. Muller, Towards a genetic    definition of cancer-associated inflammation: role of the IDO    pathway. Am J Pathol, 2010. 176(5): p. 2082-7.-   94. Trapnell, C., A. Roberts, L. Goff, G. Pertea, D. Kim, D. R.    Kelley, H. Pimentel, S. L. Salzberg, J. L. Rinn, and L. Pachter,    Differential gene and transcript expression analysis of RNA-seq    experiments with TopHat and Cufflinks. Nat Protoc, 2012. 7(3): p.    562-78.-   95. Langmead, B., C. Trapnell, M. Pop, and S. L. Salzberg, Ultrafast    and memory-efficient alignment of short DNA sequences to the human    genome. Genome Biol, 2009. 10(3): p. R25.-   96. Irizarry, R. A., B. Hobbs, F. Collin, Y. D.    Beazer-Barclay, K. J. Antonellis, U. Scherf, and T. P. Speed,    Exploration, normalization, and summaries of high density    oligonucleotide array probe level data. Biostatistics, 2003.    4(2): p. 249-64.-   97. Mestdagh, P., P. Van Vlierberghe, A. De Weer, D. Muth, F.    Westermann, F. Speleman, and J. Vandesompele, A novel and universal    method for microRNA RT-qPCR data normalization. Genome Biol, 2009.    10(6): p. R64.

All cited references are herein expressly incorporated by reference in their entirety.

Whereas particular embodiments have been described above for purposes of illustration, it will be appreciated by those skilled in the art that numerous variations of the details may be made without departing from the disclosure as described in the appended claims.

TABLE 1 Up-regulated pathways (GO categories).

| Category | Pathway | FDR |
| --- | --- | --- |
| Cell adhesion | CALCIUM_INDEPENDENT_CELL_CELL_ADHESION | 0.022 |
| Cell adhesion | CELL_SUBSTRATE_ADHERENS_JUNCTION | 0.042 |
| Cell growth and death | CELL_STRUCTURE_DISASSEMBLY_DURING_APOPTOSIS | 0.033 |
| Cell growth and death | POSITIVE_REGULATION_OF_CELL_PROLIFERATION | 0.033 |
| Immune system | INFLAMMATORY_RESPONSE | 0.033 |
| Immune system | IMMUNOLOGICAL_SYNAPSE | 0.045 |
| Signal transduction | POSITIVE_REGULATION_OF_SECRETION | 0.045 |
| Signal transduction | G_PROTEIN_COUPLED_RECEPTOR_PROTEIN_SIGNALING | 0.042 |
| Signal transduction | SECOND_MESSENGER_MEDIATED_SIGNALING | 0.045 |
| Metabolism | AROMATIC_COMPOUND_METABOLIC_PROCESS | 0.022 |
| Metabolism | HETEROCYCLE_METABOLIC_PROCESS | 0.022 |
| Differentiation | CELLULAR_MORPHOGENESIS_DURING_DIFFERENTIATION | 0.045 |
| Cellular component organization | EXTRACELLULAR_STRUCTURE_ORGANIZATION_AND_BIOGENESIS | 0.042 |
| Neuron development | AXONOGENESIS | 0.042 |
| Neuron development | NEURITE_DEVELOPMENT | 0.045 |

TABLE 2 Performance of the nearest shrunken centroid classifier in classifying independent SSA/P and HP samples acquired by microarray platforms using 3 signatures.

| Platforms | Concordant genes | Signature size | Signature | Illum. errors | Affy. errors |
| --- | --- | --- | --- | --- | --- |
| Training: RNA-seq; Testing: Illumina | C4BPA, CEMIP, CHGA, CLDN1, CPE, DPP10, FSIP2, GRAMD1B, GRIN2D, IL2RG, KIZ, KLK7, MEGF6, MYCN, NTRK2, PLA2G16, RAMP1, SBSPON, SEMG1, SLC7A9, SPIRE1, TM4SF4 | 18 | C4BPA, CHGA, CLDN1, CPE, DPP10, GRAMD1B, GRIN2D, KIZ, KLK7, MEGF6, MYCN, NTRK2, PLA2G16, SBSPON, SEMG1, SLC7A9, SPIRE1, TM4SF4 | 0 | - |
| Training: RNA-seq; Testing: Affymetrix | CLDN1, FOXD1, IDO1, IL2RG, KIZ, LMO4, MEGF6, NTRK2, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1, TRIB2, ZIC2 | 16 | CLDN1, FOXD1, KIZ, MEGF6, NTRK2, PIK3R3, PLA2G16, PRUNE2, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, TPD52L1, TRIB2 | - | 3 |
| Training: RNA-seq; Testing: Illumina and Affymetrix | CHFR, CHGA, CLDN1, IL2RG, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2, VSIG1, ZIC2 | 13 | CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, TACSTD2 | 0 | 3 |

TABLE 3 Genes included in the smallest (13-gene) signature.

| Gene | log₂FC | FC | Description |
| --- | --- | --- | --- |
| SLC7A9 | 3.22 | 9.34 | solute carrier family 7 member 9 |
| SEMG1 | 2.95 | 7.72 | semenogelin I |
| MEGF6 | 2.66 | 6.34 | multiple EGF like domains 6 |
| TACSTD2 | 1.93 | 3.82 | tumor-associated calcium signal transducer 2 |
| CLDN1 | 1.85 | 3.59 | claudin 1 |
| SBSPON | 1.23 | 2.35 | somatomedin B and thrombospondin type 1 domain containing |
| PLA2G16 | 1.18 | 2.27 | phospholipase A2 group XVI |
| PTAFR | 1.08 | 2.11 | platelet activating factor receptor |
| KIZ | 0.98 | 1.98 | kizuna centrosomal protein |
| SPIRE1 | 0.82 | 1.76 | spire type actin nucleation factor 1 |
| CHFR | −0.62 | 0.65 | checkpoint with forkhead and ring finger domains, E3 ubiquitin protein ligase |
| CHGA | −1.63 | 0.32 | chromogranin A |
| NTRK2 | −2.32 | 0.20 | neurotrophic tyrosine kinase, receptor, type 2 |
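The log₂FC and FC columns of Table 3 are related by FC = 2^(log₂FC). The following Python sketch is offered only as an illustration of that relationship and is not part of the disclosed method. The names `TABLE3`, `fold_change` and `direction` are hypothetical, and reading a positive log₂FC as higher expression in SSA/Ps is an assumption. Small differences from the tabulated FC values are expected because the log₂FC values are rounded to two decimals.

```python
# Illustrative sketch only: recompute fold change (FC) from log2FC for Table 3.
# TABLE3, fold_change and direction are hypothetical names, not from the disclosure.

# (gene, log2FC, FC as printed in Table 3)
TABLE3 = [
    ("SLC7A9", 3.22, 9.34),
    ("SEMG1", 2.95, 7.72),
    ("MEGF6", 2.66, 6.34),
    ("TACSTD2", 1.93, 3.82),
    ("CLDN1", 1.85, 3.59),
    ("SBSPON", 1.23, 2.35),
    ("PLA2G16", 1.18, 2.27),
    ("PTAFR", 1.08, 2.11),
    ("KIZ", 0.98, 1.98),
    ("SPIRE1", 0.82, 1.76),
    ("CHFR", -0.62, 0.65),
    ("CHGA", -1.63, 0.32),
    ("NTRK2", -2.32, 0.20),
]


def fold_change(log2_fc):
    """Return the linear fold change corresponding to a log2 fold change."""
    return 2.0 ** log2_fc


def direction(log2_fc):
    """Assumption: a positive log2FC means higher expression in SSA/P."""
    return "higher in SSA/P" if log2_fc > 0 else "lower in SSA/P"


if __name__ == "__main__":
    for gene, log2_fc, printed_fc in TABLE3:
        print(
            f"{gene:8s} log2FC={log2_fc:+.2f} "
            f"FC={fold_change(log2_fc):.2f} (table: {printed_fc:.2f}) "
            f"{direction(log2_fc)}"
        )
```

Under this reading, ten of the thirteen signature genes have FC above 1 and three (CHFR, CHGA, NTRK2) have FC below 1.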

TABLE 4 List of 139 genes DE between HP and SSA/P and between CR and SSA/P but not between CR and CL.

| # | gene | locus | mean_HP | mean_SSA/P | log₂FC | test_stat | p_value | p_adj_value | Description |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | KLK7 | chr19:51479734-51487320 | 0.17 | 6.77 | 5.33 | 2.90 | 2.50E−04 | 1.03E−02 | kallikrein-related peptidase 7 |
| 2 | MUC6 | chr11:1012823-1036706 | 0.10 | 2.41 | 4.58 | 2.92 | 5.00E−05 | 2.53E−03 | mucin 6, oligomeric mucus/gel-forming |
| 3 | GRIN2D | chr19:48898131-48948188 | 0.50 | 5.36 | 3.43 | 2.87 | 5.00E−05 | 2.53E−03 | glutamate receptor, ionotropic, N-methyl D-aspartate 2D |
| 4 | SLC7A9 | chr19:33321418-33360683 | 0.28 | 2.63 | 3.22 | 1.96 | 5.00E−05 | 2.53E−03 | solute carrier family 7 (amino acid transporter light chain, bo,+ system), member 9 |
| 5 | SEMG1 | chr20:43835637-43838414 | 0.38 | 2.97 | 2.95 | 1.99 | 5.00E−05 | 2.53E−03 | semenogelin I |
| 6 | AMH | chr19:2249112-2252072 | 0.33 | 2.50 | 2.91 | 1.84 | 5.00E−05 | 2.53E−03 | anti-Mullerian hormone |
| 7 | MEGF6 | chr1:3404505-3528059 | 1.45 | 9.20 | 2.66 | 2.81 | 5.00E−05 | 2.53E−03 | multiple EGF-like-domains 6 |
| 8 | ZIC2 | chr13:100634025-100639019 | 0.40 | 2.33 | 2.56 | 1.71 | 5.00E−05 | 2.53E−03 | Zic family member 2 |
| 9 | TM4SF4 | chr3:149192367-149221181 | 14.79 | 82.89 | 2.49 | 2.58 | 5.00E−05 | 2.53E−03 | transmembrane 4 L six family member 4 |
| 10 | CA9 | chr9:35673914-35681154 | 0.73 | 4.02 | 2.46 | 1.38 | 5.00E−05 | 2.53E−03 | carbonic anhydrase IX |
| 11 | CXCL9 | chr4:76922622-76928641 | 1.79 | 9.09 | 2.35 | 1.94 | 5.00E−05 | 2.53E−03 | chemokine (C-X-C motif) ligand 9 |
| 12 | CXCL10 | chr4:76932332-77033955 | 2.62 | 13.27 | 2.34 | 1.35 | 1.40E−03 | 3.88E−02 | chemokine (C-X-C motif) ligand 10 |
| 13 | CLDN2 | chrX:106143292-106174091 | 1.37 | 6.53 | 2.25 | 2.16 | 5.00E−05 | 2.53E−03 | claudin 2 |
| 14 | CNTD2 | chr19:40728114-40732597 | 0.37 | 1.71 | 2.22 | 1.28 | 1.50E−04 | 6.74E−03 | cyclin N-terminal domain containing 2 |
| 15 | DEFA5 | chr8:6912828-6914259 | 4.34 | 19.74 | 2.19 | 1.34 | 1.50E−04 | 6.74E−03 | defensin, alpha 5, Paneth cell-specific |
| 16 | FOXD1 | chr5:72742084-72744352 | 0.70 | 3.15 | 2.17 | 1.39 | 1.00E−04 | 4.83E−03 | forkhead box D1 |
| 17 | NR0B2 | chr1:27237974-27240567 | 1.54 | 6.46 | 2.07 | 1.65 | 5.00E−05 | 2.53E−03 | nuclear receptor subfamily 0, group B, member 2 |
| 18 | C4BPA | chr1:207277606-207318317 | 1.27 | 5.25 | 2.04 | 1.69 | 5.00E−05 | 2.53E−03 | complement component 4 binding protein, alpha |
| 19 | MYCN | chr2:16076386-16087129 | 0.53 | 2.16 | 2.03 | 1.50 | 5.00E−05 | 2.53E−03 | v-myc avian myelocytomatosis viral oncogene neuroblastoma derived homolog |
| 20 | PLA2G3 | chr22:31530792-31536593 | 0.39 | 1.53 | 1.98 | 1.45 | 5.00E−05 | 2.53E−03 | phospholipase A2, group III |
| 21 | MSX2 | chr5:174151574-174157902 | 0.32 | 1.26 | 1.97 | 1.26 | 4.50E−04 | 1.62E−02 | msh homeobox 2 |
| 22 | URAD | chr13:28552242-28562774 | 3.45 | 13.28 | 1.95 | 1.41 | 5.00E−05 | 2.53E−03 | ureidoimidazoline (2-oxo-4-hydroxy-4-carboxy-5-) decarboxylase |
| 23 | TACSTD2 | chr1:59041094-59043166 | 3.03 | 11.57 | 1.93 | 1.87 | 5.00E−05 | 2.53E−03 | tumor-associated calcium signal transducer 2 |
| 24 | NOS2 | chr17:26083791-26127555 | 6.56 | 24.21 | 1.88 | 1.66 | 5.00E−05 | 2.53E−03 | nitric oxide synthase 2, inducible |
| 25 | CLDN1 | chr3:190023489-190040235 | 2.38 | 8.54 | 1.85 | 2.16 | 5.00E−05 | 2.53E−03 | claudin 1 |
| 26 | TFF2 | chr21:43766466-43771208 | 42.79 | 152.57 | 1.83 | 2.01 | 5.00E−05 | 2.53E−03 | trefoil factor 2 |
| 27 | TLX1 | chr10:102891060-102897546 | 0.44 | 1.52 | 1.77 | 1.25 | 1.75E−03 | 4.53E−02 | T-cell leukemia homeobox 1 |
| 28 | KLK11 | chr19:51525486-51531290 | 10.61 | 34.70 | 1.71 | 2.12 | 5.00E−05 | 2.53E−03 | kallikrein-related peptidase 11 |
| 29 | LOC102723854 | chr2:43254991-43266682 | 0.52 | 1.68 | 1.69 | 0.87 | 1.90E−03 | 4.85E−02 | uncharacterized LOC102723854 |
| 30 | CYP4X1 | chr1:47489239-47516423 | 0.31 | 0.97 | 1.66 | 1.14 | 5.00E−05 | 2.53E−03 | cytochrome P450, family 4, subfamily X, polypeptide 1 |
| 31 | MMP1 | chr11:102654406-102714342 | 5.37 | 16.68 | 1.64 | 1.80 | 5.00E−05 | 2.53E−03 | matrix metallopeptidase 1 |
| 32 | ATG9B | chr7:150688143-150721586 | 1.11 | 3.44 | 1.63 | 1.36 | 5.50E−04 | 1.86E−02 | autophagy related 9B |
| 33 | GRAMD1B | chr11:123396343-123498479 | 0.78 | 2.37 | 1.60 | 1.62 | 5.00E−05 | 2.53E−03 | GRAM domain containing 1B |
| 34 | WDR72 | chr15:53805937-54055075 | 0.79 | 2.37 | 1.59 | 1.52 | 5.00E−05 | 2.53E−03 | WD repeat domain 72 |
| 35 | APOL1 | chr22:36649116-36663577 | 15.22 | 44.08 | 1.53 | 2.10 | 5.00E−05 | 2.53E−03 | apolipoprotein L, 1 |
| 36 | RNF183 | chr9:116059372-116061320 | 0.70 | 1.88 | 1.43 | 0.96 | 4.50E−04 | 1.62E−02 | ring finger protein 183 |
| 37 | CEMIP | chr15:81071711-81243999 | 1.45 | 3.91 | 1.43 | 1.66 | 5.00E−05 | 2.53E−03 | cell migration inducing protein, hyaluronan binding |
| 38 | LYPD5 | chr19:44300078-44324808 | 0.66 | 1.77 | 1.43 | 1.07 | 4.50E−04 | 1.62E−02 | LY6/PLAUR domain containing 5 |
| 39 | KLK10 | chr19:51515999-51523431 | 15.31 | 41.05 | 1.42 | 1.69 | 5.00E−05 | 2.53E−03 | kallikrein-related peptidase 10 |
| 40 | HLA-DRB5 | chr6:32485153-32498006 | 7.40 | 19.67 | 1.41 | 1.36 | 5.00E−05 | 2.53E−03 | major histocompatibility complex, class II, DR beta 5 |
| 41 | RAMP1 | chr2:238768186-238820755 | 2.09 | 5.53 | 1.41 | 1.21 | 3.00E−04 | 1.18E−02 | receptor (G protein-coupled) activity modifying protein 1 |
| 42 | IDO1 | chr8:39771327-39786309 | 2.18 | 5.76 | 1.40 | 1.32 | 5.00E−05 | 2.53E−03 | indoleamine 2,3-dioxygenase 1 |
| 43 | NBPF7 | chr1:120377387-120387503 | 1.43 | 3.78 | 1.40 | 1.29 | 5.00E−05 | 2.53E−03 | neuroblastoma breakpoint family, member 7 |
| 44 | UBD | chr6_qbl_hap6:826706-831021 | 6.92 | 17.76 | 1.36 | 1.25 | 5.00E−05 | 2.53E−03 | ubiquitin D |
| 45 | SLFN5 | chr17:33570085-33594761 | 1.05 | 2.60 | 1.31 | 1.40 | 5.00E−05 | 2.53E−03 | schlafen family member 5 |
| 46 | APOD | chr3:195295572-195311076 | 4.78 | 11.69 | 1.29 | 1.22 | 5.00E−05 | 2.53E−03 | apolipoprotein D |
| 47 | GBP4 | chr1:89646830-89664633 | 2.19 | 5.23 | 1.26 | 1.51 | 5.00E−05 | 2.53E−03 | guanylate binding protein 4 |
| 48 | CARD6 | chr5:40841409-40855456 | 1.51 | 3.58 | 1.24 | 1.41 | 5.00E−05 | 2.53E−03 | caspase recruitment domain family, member 6 |
| 49 | SBSPON | chr8:73976777-74005507 | 0.70 | 1.63 | 1.23 | 1.17 | 5.00E−05 | 2.53E−03 | somatomedin B and thrombospondin, type 1 domain containing |
| 50 | LCN2 | chr9:130911731-130915734 | 242.72 | 565.65 | 1.22 | 1.21 | 5.00E−05 | 2.53E−03 | lipocalin 2 |
| 51 | TRIB2 | chr2:12856997-12882858 | 3.63 | 8.27 | 1.19 | 1.48 | 5.00E−05 | 2.53E−03 | tribbles pseudokinase 2 |
| 52 | PLA2G16 | chr11:63341943-63381941 | 10.94 | 24.83 | 1.18 | 1.41 | 5.00E−05 | 2.53E−03 | phospholipase A2, group XVI |
| 53 | TPD52L1 | chr6:125474878-125584644 | 6.24 | 14.12 | 1.18 | 1.45 | 5.00E−05 | 2.53E−03 | tumor protein D52-like 1 |
| 54 | CFB | chr6_ssto_hap7:3246430-3252571 | 12.33 | 27.89 | 1.18 | 1.67 | 5.00E−05 | 2.53E−03 | complement factor B |
| 55 | TMEM92 | chr17:48348766-48358846 | 4.61 | 10.25 | 1.15 | 1.43 | 5.00E−05 | 2.53E−03 | transmembrane protein 92 |
| 56 | CASP5 | chr11:104864966-104893895 | 10.67 | 23.11 | 1.11 | 1.29 | 5.00E−05 | 2.53E−03 | caspase 5, apoptosis-related cysteine peptidase |
| 57 | GPD1 | chr12:50497601-50505103 | 3.30 | 7.08 | 1.10 | 1.27 | 5.00E−05 | 2.53E−03 | glycerol-3-phosphate dehydrogenase 1 (soluble) |
| 58 | VSIG1 | chrX:107288199-107322414 | 13.32 | 28.55 | 1.10 | 1.09 | 1.60E−03 | 4.26E−02 | V-set and immunoglobulin domain containing 1 |
| 59 | CCL22 | chr16:57392694-57400102 | 0.89 | 1.90 | 1.09 | 0.91 | 7.00E−04 | 2.26E−02 | chemokine (C-C motif) ligand 22 |
| 60 | TRNP1 | chr1:27320194-27327377 | 4.52 | 9.56 | 1.08 | 1.06 | 5.50E−04 | 1.86E−02 | TMF1-regulated nuclear protein 1 |
| 61 | LAMP3 | chr3:182840002-182880667 | 0.97 | 2.04 | 1.08 | 0.99 | 9.50E−04 | 2.87E−02 | lysosomal-associated membrane protein 3 |
| 62 | PTAFR | chr1:28473676-28520447 | 4.89 | 10.32 | 1.08 | 1.52 | 5.00E−05 | 2.53E−03 | platelet-activating factor receptor |
| 63 | CNGA1 | chr4:47937993-48014961 | 1.36 | 2.84 | 1.06 | 1.06 | 8.00E−04 | 2.53E−02 | cyclic nucleotide gated channel alpha 1 |
| 64 | GZMA | chr5:54398473-54406080 | 8.49 | 17.42 | 1.04 | 1.20 | 5.00E−05 | 2.53E−03 | granzyme A (granzyme 1, cytotoxic T-lymphocyte-associated serine esterase 3) |
| 65 | NXF3 | chrX:102330749-102348022 | 3.67 | 7.47 | 1.03 | 1.11 | 2.00E−04 | 8.55E−03 | nuclear RNA export factor 3 |
| 66 | TMEM45A | chr3:100211462-100296285 | 2.33 | 4.73 | 1.02 | 0.87 | 1.70E−03 | 4.44E−02 | transmembrane protein 45A |
| 67 | FKBP10 | chr17:39968961-39979469 | 5.93 | 11.80 | 0.99 | 1.41 | 5.00E−05 | 2.53E−03 | FK506 binding protein 10, 65 kDa |
| 68 | KIZ | chr20:21106623-21227258 | 2.73 | 5.40 | 0.98 | 1.17 | 2.00E−04 | 8.55E−03 | kizuna centrosomal protein |
| 69 | TESC | chr12:117476727-117537251 | 4.93 | 9.66 | 0.97 | 0.93 | 1.25E−03 | 3.53E−02 | tescalcin |
| 70 | ZNF488 | chr10:48355088-48373866 | 0.83 | 1.62 | 0.96 | 0.88 | 1.50E−03 | 4.05E−02 | zinc finger protein 488 |
| 71 | GOLGA7B | chr10:99609994-99790585 | 0.79 | 1.53 | 0.96 | 0.97 | 8.50E−04 | 2.66E−02 | golgin A7 family, member B |
| 72 | CCL2 | chr17:32582295-32584220 | 11.11 | 21.24 | 0.93 | 1.04 | 2.00E−04 | 8.55E−03 | chemokine (C-C motif) ligand 2 |
| 73 | HBB | chr11:5246695-5248301 | 18.21 | 34.40 | 0.92 | 0.98 | 5.50E−04 | 1.86E−02 | hemoglobin, beta |
| 74 | GALNT6 | chr12:51745832-51909547 | 6.65 | 12.55 | 0.92 | 1.32 | 5.00E−05 | 2.53E−03 | polypeptide N-acetylgalactosaminyltransferase 6 |
| 75 | BIRC3 | chr11:102188180-102210135 | 10.86 | 20.43 | 0.91 | 1.30 | 5.00E−05 | 2.53E−03 | baculoviral IAP repeat containing 3 |
| 76 | GBP2 | chr1:89571815-89591842 | 13.55 | 25.31 | 0.90 | 1.51 | 5.00E−05 | 2.53E−03 | guanylate binding protein 2, interferon-inducible |
| 77 | SEC16B | chr1:177898241-177939050 | 3.65 | 6.81 | 0.90 | 1.22 | 5.00E−05 | 2.53E−03 | SEC16 homolog B, endoplasmic reticulum export factor |
| 78 | EPSTI1 | chr13:43460523-43566407 | 4.24 | 7.85 | 0.89 | 1.26 | 5.00E−05 | 2.53E−03 | epithelial stromal interaction 1 (breast) |
| 79 | XAF1 | chr17:6659155-6678964 | 7.25 | 13.42 | 0.89 | 1.36 | 5.00E−05 | 2.53E−03 | XIAP associated factor 1 |
| 80 | GBP1 | chr1:89517986-89531043 | 6.29 | 11.36 | 0.85 | 1.12 | 1.00E−04 | 4.83E−03 | guanylate binding protein 1, interferon-inducible |
| 81 | EVPL | chr17:74002926-74023507 | 7.91 | 14.27 | 0.85 | 1.41 | 5.00E−05 | 2.53E−03 | envoplakin |
| 82 | IFIT3 | chr10:91087575-91100725 | 3.66 | 6.60 | 0.85 | 1.06 | 2.00E−04 | 8.55E−03 | interferon-induced protein with tetratricopeptide repeats 3 |
| 83 | KIFC3 | chr16:57792128-57836439 | 2.83 | 5.07 | 0.84 | 1.03 | 6.50E−04 | 2.11E−02 | kinesin family member C3 |
| 84 | RAB27B | chr18:52495707-52562747 | 4.69 | 8.31 | 0.82 | 1.39 | 5.00E−05 | 2.53E−03 | RAB27B, member RAS oncogene family |
| 85 | SPIRE1 | chr18:12446510-12657912 | 2.19 | 3.87 | 0.82 | 1.09 | 1.50E−04 | 6.74E−03 | spire-type actin nucleation factor 1 |
| 86 | TBX3 | chr12:115108058-115121969 | 5.34 | 9.19 | 0.78 | 1.11 | 5.00E−05 | 2.53E−03 | T-box 3 |
| 87 | OAS2 | chr12:113416273-113449528 | 5.11 | 8.70 | 0.77 | 1.02 | 5.50E−04 | 1.86E−02 | 2′-5′-oligoadenylate synthetase 2, 69/71 kDa |
| 88 | YBX2 | chr17:7191570-7197876 | 9.82 | 16.73 | 0.77 | 0.91 | 9.00E−04 | 2.76E−02 | Y box binding protein 2 |
| 89 | DCDC2 | chr6:24171982-24383520 | 2.88 | 4.90 | 0.77 | 0.97 | 1.50E−03 | 4.05E−02 | doublecortin domain containing 2 |
| 90 | MX1 | chr21:42792484-42831141 | 10.08 | 16.99 | 0.75 | 1.09 | 5.00E−05 | 2.53E−03 | MX dynamin-like GTPase 1 |
| 91 | UNC5CL | chr6:40994639-41006938 | 4.00 | 6.60 | 0.72 | 0.93 | 9.00E−04 | 2.76E−02 | unc-5 family C-terminal like |
| 92 | IFI6 | chr1:27992571-27998724 | 29.98 | 49.49 | 0.72 | 0.95 | 1.15E−03 | 3.29E−02 | interferon, alpha-inducible protein 6 |
| 93 | CROT | chr7:86974950-87029112 | 6.33 | 10.43 | 0.72 | 0.99 | 1.15E−03 | 3.29E−02 | carnitine O-octanoyltransferase |
| 94 | SAMD9L | chr7:92759367-92777680 | 6.11 | 10.00 | 0.71 | 1.10 | 5.00E−05 | 2.53E−03 | sterile alpha motif domain containing 9-like |
| 95 | TNFSF15 | chr9:117546914-117568408 | 6.06 | 9.90 | 0.71 | 1.03 | 3.50E−04 | 1.32E−02 | tumor necrosis factor (ligand) superfamily, member 15 |
| 96 | PRUNE2 | chr9:79226291-79521003 | 1.24 | 2.02 | 0.71 | 1.00 | 2.50E−04 | 1.03E−02 | prune homolog 2 (Drosophila) |
| 97 | ADGRG6 | chr6:142623055-142767403 | 9.79 | 15.85 | 0.70 | 1.07 | 6.50E−04 | 2.11E−02 | adhesion G protein-coupled receptor G6 |
| 98 | ANO1 | chr11:69924407-70035652 | 5.25 | 8.45 | 0.69 | 0.99 | 4.50E−04 | 1.62E−02 | anoctamin 1, calcium activated chloride channel |
| 99 | ERO1A | chr14:53108604-53162419 | 43.91 | 70.11 | 0.68 | 1.10 | 6.00E−04 | 1.99E−02 | endoplasmic reticulum oxidoreductase alpha |
| 100 | SLC7A7 | chr14:23242431-23289020 | 7.94 | 12.66 | 0.67 | 0.92 | 1.75E−03 | 4.53E−02 | solute carrier family 7 (amino acid transporter light chain, y+L system), member 7 |
| 101 | PIK3R3 | chr1:46505811-46598708 | 2.63 | 4.17 | 0.67 | 0.91 | 1.55E−03 | 4.17E−02 | phosphoinositide-3-kinase, regulatory subunit 3 (gamma) |
| 102 | CA13 | chr8:86157715-86196302 | 4.69 | 7.42 | 0.66 | 0.94 | 1.10E−03 | 3.22E−02 | carbonic anhydrase XIII |
| 103 | RHOBTB1 | chr10:62629197-62761198 | 6.43 | 10.16 | 0.66 | 0.95 | 8.00E−04 | 2.53E−02 | Rho-related BTB domain containing 1 |
| 104 | IRS2 | chr13:110406183-110438914 | 9.42 | 14.73 | 0.65 | 1.04 | 1.50E−04 | 6.74E−03 | insulin receptor substrate 2 |
| 105 | MYRF | chr11:61520120-61555989 | 13.30 | 20.75 | 0.64 | 0.99 | 5.00E−04 | 1.75E−02 | myelin regulatory factor |
| 106 | IFI27 | chr14:94577078-94583036 | 715.46 | 1115.84 | 0.64 | 0.92 | 9.50E−04 | 2.87E−02 | interferon, alpha-inducible protein 27 |
| 107 | FSIP2 | chr2:186584600-186698016 | 1.29 | 2.01 | 0.64 | 1.09 | 1.15E−03 | 3.29E−02 | fibrous sheath interacting protein 2 |
| 108 | LMO4 | chr1:87794150-87814607 | 7.65 | 11.83 | 0.63 | 1.00 | 6.50E−04 | 2.11E−02 | LIM domain only 4 |
| 109 | IMPDH1 | chr7:128032330-128050036 | 13.26 | 20.52 | 0.63 | 0.95 | 9.50E−04 | 2.87E−02 | IMP (inosine 5′-monophosphate) dehydrogenase 1 |
| 110 | APOL2 | chr22:36622254-36636000 | 16.40 | 25.15 | 0.62 | 0.96 | 5.00E−04 | 1.75E−02 | apolipoprotein L, 2 |
111 PARP14 chr3: 122399671- 10.79 
16.240.59 0.93 9.00E−04 2.76E−02 poly (ADP-ribose) polymerase 122449687family, member 14 112 SLC5A1 chr22: 32439018- 12.26 18.42 0.59 0.911.30E−03 3.64E−02 solute carrier family 5 32509011 (sodium/glucosecotransporter), member 1 113 VSIG2 chr11: 124617369- 282.20 196.63 −0.52−0.84 1.60E−03 4.26E−02 V-set and immunoglobulin domain 124622109containing 2 114 C11orf86 chr11: 66742753- 101.24 68.68 −0.56 −0.911.05E−03 3.10E−02 chromosome 11 open reading 66744479 frame 86 115 AK1chr9: 130628758- 66.42 44.84 −0.57 −0.95 3.00E−04 1.18E−02 adenylatekinase 1 130640022 116 AIM1L chr1: 26648349- 13.43 8.94 −0.59 −0.974.50E−04 1.62E−02 absent in melanoma 1 -like 26680621 117 CHFR chr12:133416937- 13.81 8.96 −0.62 −0.93 9.50E−04 2.87E−02 checkpoint withforkhead and 133464204 ring finger domains, E3 ubiquitin protein ligase118 IL2RG chrX: 70327253- 184.79 116.15 −0.67 −1.00 1.00E−04 4.83E−03interleukin 2 receptor, gamma 70331481 119 CSRNP1 chr3: 39183341- 13.007.76 −0.75 −1.08 5.00E−05 2.53E−03 cysteine-serine-rich nuclear 39195102protein 1 120 RASL11A chr13: 27844463- 28.05 16.35 −0.78 −1.14 5.00E−052.53E−03 RAS-like, family 11, member A 27847827 121 TRIM40chr6_ssto_hap7: 8.23 4.75 −0.79 −0.95 1.15E−03 3.29E−02 tripartite motifcontaining 40 1434344- 1446965 122 CCND1 chr11: 69455872- 35.15 20.10−0.81 −1.29 5.00E−05 2.53E−03 cyclin D1 69469242 123 NEU4 chr2:242750159- 22.57 11.83 −0.93 −1.38 2.00E−04 8.55E−03 sialidase 4242758739 124 JUNB chr19: 12902309- 155.87 81.48 −0.94 −1.35 5.00E−052.53E−03 jun B proto-oncogene 12904125 125 SFRP1 chr8: 41119475- 1.991.03 −0.95 −0.91 1.30E−03 3.64E−02 secreted frizzled-related 41166990protein 1 126 WSCD1 chr17: 5973933- 8.34 4.32 −0.95 −1.34 5.00E−052.53E−03 WSC domain containing 1 6027747 127 SHROOM2 chrX: 9754495- 1.280.65 −0.98 −0.98 1.05E−03 3.10E−02 shroom family member 2 9917481 128RNASE1 chr14: 21269514- 389.26 195.05 −1.00 −1.30 5.00E−05 2.53E−03ribonuclease, RNase A family, 1 21271036 (pancreatic) 129 C2orf54 chr2:241825464- 3.81 
1.89 −1.01 −1.00 1.20E−03 3.40E−02 chromosome 2 openreading frame 241835573 54 130 CPE chr4: 166300096- 17.08 8.21 −1.06−1.48 5.00E−05 2.53E−03 carboxypeptidase E 166419482 131 SLC6A14 chrX:115567746- 24.17 10.92 −1.15 −1.67 5.00E−05 2.53E−03 solute carrierfamily 6 (amino 115592625 acid transporter), member 14 132 HBEGF chr5:139712427- 26.40 11.52 −1.20 −1.50 5.00E−05 2.53E−03 heparin-bindingEGF-like growth 139726188 factor 133 BAMBI chr10: 28966423- 20.28 8.51−1.25 −1.68 5.00E−05 2.53E−03 BMP and activin membrane-bound 28971868inhibitor 134 DPP10 chr2: 115199898- 1.39 0.58 −1.27 −0.98 1.80E−034.63E−02 dipeptidyl-peptidase 10 (non- 116602326 functional) 135 CHGAchr14: 93389444- 219.86 71.16 −1.63 −1.87 5.00E−05 2.53E−03 chromograninA 93401638 136 NEUROD1 chr2: 182540832- 1.56 0.45 −1.79 −1.32 5.00E−052.53E−03 neuronal differentiation 1 182545392 137 VLDLR chr9: 2535654-7.39 2.12 −1.80 −1.94 5.00E−05 2.53E−03 very low density lipoprotein2654485 receptor 138 RPPH1 chr14: 20811229- 123.42 30.35 −2.02 −1.322.00E−04 8.55E−03 ribonuclease P RNA component 20811570 H1 139 NTRK2chr9: 87283372- 3.92 0.78 −2.32 −2.35 5.00E−05 2.53E−03 neurotrophictyrosine kinase, 87641985 receptor, type 2

TABLE 5 The list of 134 genes exclusively DE between HP and SSA/P.
#  gene  locus  mean_HP  mean_SSA/P  log₂FC  test_stat  p_value  p_adj_value  Description
1  MIR4800  chr4:2249159-2263739  0.00  37.38  Inf  NA  1.45E−03  3.98E−02  microRNA 4800
2  KLK8  chr19:51499263-51504958  0.17  4.74  4.83  1.36  1.65E−03  4.36E−02  kallikrein-related peptidase 8
3  DUSP27  chr1:167064086-167098402  0.36  6.91  4.25  2.54  5.00E−05  2.53E−03  dual specificity phosphatase 27 (putative)
4  ANXA10  chr4:169013687-169108893  6.18  51.85  3.07  3.11  5.00E−05  2.53E−03  annexin A10
5  CLDN18  chr3:137717657-137752494  0.21  1.37  2.73  1.73  8.50E−04  2.66E−02  claudin 18
6  NMUR2  chr5:151771101-151784840  0.26  1.59  2.64  1.34  5.00E−05  2.53E−03  neuromedin U receptor 2
7  HLA-DQB1  chr6_qbl_hap6:3858964-3866565  4.59  23.80  2.37  2.45  5.00E−05  2.53E−03  major histocompatibility complex, class II, DQ beta 1
8  FEZF1-AS1  chr7:121941447-121950131  0.67  3.11  2.21  1.67  5.00E−05  2.53E−03  FEZF1 antisense RNA 1
9  SULT1C2P1  chr2:108938693-108970254  0.25  1.17  2.20  1.25  4.50E−04  1.62E−02  sulfotransferase family, cytosolic, 1C, member 2 pseudogene 1
10  SLCO1B3  chr12:20963637-21069843  0.43  1.92  2.16  1.77  1.00E−04  4.83E−03  solute carrier organic anion transporter family, member 1B3
11  TDO2  chr4:156824844-156841558  0.51  1.64  1.69  1.29  5.00E−05  2.53E−03  tryptophan 2,3-dioxygenase
12  TCN1  chr11:59620280-59634041  2.32  7.25  1.64  1.58  5.00E−05  2.53E−03  transcobalamin I (vitamin B12 binding protein, R binder family)
13  CCL8  chr17:32646065-32648421  1.47  4.41  1.59  1.38  5.00E−05  2.53E−03  chemokine (C-C motif) ligand 8
14  NAT8B  chr2:73927635-73928467  1.88  5.33  1.51  1.11  5.00E−05  2.53E−03  N-acetyltransferase 8B (GCN5-related, putative, gene/pseudogene)
15  PSCA  chr8:143751725-143764145  11.23  31.23  1.48  1.58  5.00E−05  2.53E−03  prostate stem cell antigen
16  LRRIQ4  chr3:169539709-169555560  0.46  1.26  1.45  1.07  1.50E−04  6.74E−03  leucine-rich repeats and IQ motif containing 4
17  CTHRC1  chr8:104383742-104395232  0.85  2.31  1.44  1.09  2.50E−04  1.03E−02  collagen triple helix repeat containing 1
18  SRMS  chr20:62171276-62178857  0.95  2.42  1.36  1.27  5.00E−05  2.53E−03  src-related kinase lacking C-terminal regulatory tyrosine and N-terminal myristylation sites
19  MMP10  chr11:102641232-102651359  0.56  1.41  1.33  0.94  7.50E−04  2.40E−02  matrix metallopeptidase 10
20  DMBT1  chr10:124320180-124403252  4.66  11.35  1.29  1.48  5.00E−05  2.53E−03  deleted in malignant brain tumors 1
21  TRHDE  chr12:72647286-73059422  1.17  2.85  1.28  1.46  5.00E−05  2.53E−03  thyrotropin-releasing hormone degrading enzyme
22  SLC52A1  chr17:4935896-4938727  0.51  1.23  1.27  0.93  5.50E−04  1.86E−02  solute carrier family 52 (riboflavin transporter), member 1
23  PRAP1  chr10:135160843-135166187  81.41  188.04  1.21  1.88  5.00E−05  2.53E−03  proline-rich acidic protein 1
24  C4orf48  chr4:2043719-2045697  21.05  48.12  1.19  1.18  1.50E−04  6.74E−03  chromosome 4 open reading frame 48
25  CD244  chr1:160799949-160832692  0.59  1.24  1.07  0.89  1.50E−03  4.05E−02  CD244 molecule, natural killer cell receptor 2B4
26  S100A9  chr1:153330329-153333503  8.46  17.45  1.04  1.00  2.50E−04  1.03E−02  S100 calcium binding protein A9
27  RBBP8NL  chr20:60985292-61002629  1.77  3.63  1.03  1.22  5.00E−05  2.53E−03  RBBP8 N-terminal like
28  MNDA  chr1:158801167-158819270  1.72  3.32  0.95  0.95  5.50E−04  1.86E−02  myeloid cell nuclear differentiation antigen
29  PRKXP1  chr15:101087956-101099488  0.71  1.36  0.94  1.02  5.00E−04  1.75E−02  protein kinase, X-linked, pseudogene 1
30  NXPE2  chr11:114549199-114577652  2.50  4.66  0.90  0.95  1.05E−03  3.10E−02  neurexophilin and PC-esterase domain family, member 2
31  KLRB1  chr12:9747869-9760497  11.08  20.55  0.89  1.07  2.00E−04  8.55E−03  killer cell lectin-like receptor subfamily B, member 1
32  TRIP6  chr7:100464949-100471076  12.10  22.25  0.88  1.21  5.00E−05  2.53E−03  thyroid hormone receptor interactor 6
33  C3  chr19:6677845-6720662  6.29  11.51  0.87  1.21  5.00E−05  2.53E−03  complement component 3
34  PEAR1  chr1:156863522-156886226  1.25  2.28  0.87  0.92  9.00E−04  2.76E−02  platelet endothelial aggregation receptor 1
35  ANK3  chr10:61786055-62493284  4.83  8.57  0.83  1.30  5.00E−05  2.53E−03  ankyrin 3, node of Ranvier (ankyrin G)
36  PRKCDBP  chr11:6340175-6341740  6.23  11.03  0.82  0.96  1.80E−03  4.63E−02  protein kinase C, delta binding protein
37  CSF3R  chr1:36931643-36948915  1.10  1.94  0.82  0.89  1.70E−03  4.44E−02  colony stimulating factor 3 receptor (granulocyte)
38  DDX60  chr4:169137441-169239958  10.72  18.82  0.81  1.25  5.00E−05  2.53E−03  DEAD (Asp-Glu-Ala-Asp) box polypeptide 60
39  GPT2  chr16:46918307-46965201  7.21  12.44  0.79  1.16  1.00E−04  4.83E−03  glutamic pyruvate transaminase (alanine aminotransferase) 2
40  SMOC2  chr6:168841830-169068674  4.50  7.76  0.78  1.00  1.50E−04  6.74E−03  SPARC related modular calcium binding 2
41  MX2  chr21:42733949-42780869  2.27  3.87  0.77  0.89  1.15E−03  3.29E−02  MX dynamin-like GTPase 2
42  FUOM  chr10:135168657-135171529  20.79  35.24  0.76  0.90  1.65E−03  4.36E−02  fucose mutarotase
43  PLEKHH2  chr2:43864438-43995126  1.35  2.24  0.73  0.94  9.00E−04  2.76E−02  pleckstrin homology domain containing, family H (with MyTH4 domain) member 2
44  PRRT2  chr16:29823408-29827202  8.71  14.40  0.72  1.02  1.50E−04  6.74E−03  proline-rich transmembrane protein 2
45  CCDC88A  chr2:55514977-55647057  0.98  1.59  0.70  0.95  1.20E−03  3.40E−02  coiled-coil domain containing 88A
46  ENPP1  chr6:132129155-132216295  2.38  3.87  0.70  1.04  2.00E−04  8.55E−03  ectonucleotide pyrophosphatase/phosphodiesterase 1
47  LCP1  chr13:46700057-46756459  17.79  28.70  0.69  1.09  5.00E−05  2.53E−03  lymphocyte cytosolic protein 1 (L-plastin)
48  DISP2  chr15:40650433-40663256  6.09  9.75  0.68  1.00  4.50E−04  1.62E−02  dispatched homolog 2 (Drosophila)
49  ATHL1  chr11:289137-295688  10.02  15.98  0.67  1.05  2.00E−04  8.55E−03  ATH1, acid trehalase-like 1 (yeast)
50  LOC730102  chr1:177975274-178007142  12.59  19.86  0.66  1.06  2.00E−04  8.55E−03  quinone oxidoreductase-like protein 2 pseudogene
51  ACE2  chrX:15579155-15620192  8.39  13.02  0.63  0.98  7.00E−04  2.26E−02  angiotensin I converting enzyme 2
52  ABCA7  chr19:1040101-1065570  4.22  6.52  0.63  1.02  3.00E−04  1.18E−02  ATP-binding cassette, sub-family A (ABC1), member 7
53  PTPRC  chr1:198608097-198726605  9.24  14.12  0.61  0.97  8.00E−04  2.53E−02  protein tyrosine phosphatase, receptor type, C
54  LAMB2  chr3:49158546-49170599  9.70  14.52  0.58  0.92  6.50E−04  2.11E−02  laminin, beta 2 (laminin S)
55  ITGAV  chr2:187454789-187545629  13.14  19.55  0.57  0.92  1.60E−03  4.26E−02  integrin, alpha V
56  CNNM4  chr2:97426638-97477628  62.68  43.69  −0.52  −0.85  1.55E−03  4.17E−02  cyclin and CBS domain divalent metal cation transport mediator 4
57  CYP2J2  chr1:60358979-60392423  29.69  20.41  −0.54  −0.88  1.50E−03  4.05E−02  cytochrome P450, family 2, subfamily J, polypeptide 2
58  VDR  chr12:48235319-48298814  47.70  32.51  −0.55  −0.91  1.75E−03  4.53E−02  vitamin D (1,25-dihydroxyvitamin D3) receptor
59  KCTD10  chr12:109886459-109915155  30.87  20.82  −0.57  −0.95  3.00E−04  1.18E−02  potassium channel tetramerization domain containing 10
60  NEDD9  chr6:11183530-11382581  30.10  20.30  −0.57  −0.82  1.15E−03  3.29E−02  neural precursor cell expressed, developmentally down-regulated 9
61  PIM1  chr6:37137921-37143204  45.23  30.41  −0.57  −0.91  1.15E−03  3.29E−02  Pim-1 proto-oncogene, serine/threonine kinase
62  SIRT6  chr19:4174105-4182596  67.88  45.43  −0.58  −1.01  5.00E−04  1.75E−02  sirtuin 6
63  MUC20  chr3:195447752-195460424  63.08  42.21  −0.58  −0.92  1.60E−03  4.26E−02  mucin 20, cell surface associated
64  TMEM98  chr17:31254927-31268667  87.80  58.68  −0.58  −0.97  3.00E−04  1.18E−02  transmembrane protein 98
65  CLCN2  chr3:184053716-184079439  37.60  25.13  −0.58  −0.93  1.65E−03  4.36E−02  chloride channel, voltage-sensitive 2
66  LEFTY1  chr1:226073981-226076846  62.02  40.94  −0.60  −0.99  5.50E−04  1.86E−02  left-right determination factor 1
67  TBC1D22B  chr6:37179953-37300746  9.73  6.42  −0.60  −0.86  1.75E−03  4.53E−02  TBC1 domain family, member 22B
68  ID1  chr20:30193085-30194317  236.63  154.07  −0.62  −1.05  6.00E−04  1.99E−02  inhibitor of DNA binding 1, dominant negative helix-loop-helix protein
69  MIDN  chr19:1248551-1259142  61.35  39.86  −0.62  −1.04  1.00E−04  4.83E−03  midnolin
70  JUND  chr19:18390503-18392466  142.42  92.25  −0.63  −1.09  4.00E−04  1.49E−02  jun D proto-oncogene
71  PDGFA  chr7:536896-559481  25.93  16.68  −0.64  −1.01  3.50E−04  1.32E−02  platelet-derived growth factor alpha polypeptide
72  DNTTIP1  chr20:44420575-44440066  29.99  19.22  −0.64  −0.99  3.00E−04  1.18E−02  deoxynucleotidyltransferase, terminal, interacting protein 1
73  SH2B3  chr12:111843751-111889427  11.65  7.41  −0.65  −0.98  3.50E−04  1.32E−02  SH2B adaptor protein 3
74  HSPA1A  chr6_qbl_hap6:3076937-3079366  18.87  11.74  −0.68  −1.12  1.50E−04  6.74E−03  heat shock 70 kDa protein 1A
75  ZFP36  chr19:39897486-39900052  149.93  93.12  −0.69  −0.96  7.50E−04  2.40E−02  ZFP36 ring finger protein
76  AQP3  chr9:33441151-33447631  11.73  7.24  −0.70  −0.88  1.35E−03  3.76E−02  aquaporin 3 (Gill blood group)
77  DDAH2  chr6_ssto_hap7:3025633-3028856  87.31  53.77  −0.70  −1.03  2.50E−04  1.03E−02  dimethylarginine dimethylaminohydrolase 2
78  SERTAD1  chr19:40928408-40931932  33.04  20.20  −0.71  −1.07  1.50E−04  6.74E−03  SERTA domain containing 1
79  CD14  chr5:140011312-140013286  62.69  38.02  −0.72  −1.11  2.00E−04  8.55E−03  CD14 molecule
80  LINC00675  chr17:10616638-10718481  54.21  32.59  −0.73  −1.01  1.50E−04  6.74E−03  long intergenic non-protein coding RNA 675
81  CLDN4  chr7:73245192-73247023  720.19  432.86  −0.73  −0.94  1.70E−03  4.44E−02  claudin 4
82  DENND2A  chr7:140218219-140302342  4.92  2.96  −0.74  −0.91  1.45E−03  3.98E−02  DENN/MADD domain containing 2A
83  PDE9A  chr21:44073861-44195618  53.48  31.85  −0.75  −0.99  5.50E−04  1.86E−02  phosphodiesterase 9A
84  PC  chr11:66615996-66725847  18.59  11.05  −0.75  −0.94  1.15E−03  3.29E−02  pyruvate carboxylase
85  DES  chr2:220283098-220291461  20.13  11.91  −0.76  −0.93  6.00E−04  1.99E−02  desmin
86  BTG2  chr1:203274663-203278729  77.53  45.56  −0.77  −1.00  1.05E−03  3.10E−02  BTG family, member 2
87  AVPI1  chr10:99437180-99447015  36.62  21.08  −0.80  −1.22  5.00E−05  2.53E−03  arginine vasopressin-induced 1
88  EMB  chr5:49692030-49737234  8.04  4.61  −0.80  −1.24  5.00E−05  2.53E−03  embigin
89  KLF4  chr9:110247132-110252047  197.26  111.42  −0.82  −1.25  5.00E−05  2.53E−03  Kruppel-like factor 4 (gut)
90  IER2  chr19:13261281-13265718  109.13  61.47  −0.83  −1.30  5.00E−05  2.53E−03  immediate early response 2
91  TPPP3  chr16:67423709-67427438  9.48  5.32  −0.83  −0.93  1.90E−03  4.85E−02  tubulin polymerization-promoting protein family member 3
92  ILDR1  chr3:121706169-121741127  10.99  6.12  −0.84  −1.22  5.00E−05  2.53E−03  immunoglobulin-like domain containing receptor 1
93  TPPP  chr5:659976-693510  3.08  1.69  −0.87  −1.12  5.00E−05  2.53E−03  tubulin polymerization promoting protein
94  WNT5B  chr12:1726221-1756378  5.41  2.90  −0.90  −0.97  9.00E−04  2.76E−02  wingless-type MMTV integration site family, member 5B
95  N4BP3  chr5:177540555-177553107  1.68  0.90  −0.90  −1.00  6.50E−04  2.11E−02  NEDD4 binding protein 3
96  GREM1  chr15:33010204-33026870  5.55  2.96  −0.91  −1.13  1.50E−04  6.74E−03  gremlin 1, DAN family BMP antagonist
97  NUPR1  chr16:28548661-28550495  68.19  35.67  −0.93  −1.19  5.00E−05  2.53E−03  nuclear protein, transcriptional regulator, 1
98  ADAMTS1  chr21:28208605-28217728  3.84  2.01  −0.94  −1.13  5.00E−05  2.53E−03  ADAM metallopeptidase with thrombospondin type 1 motif, 1
99  FRAS1  chr4:78978723-79465423  0.97  0.49  −0.98  −1.17  5.00E−05  2.53E−03  Fraser extracellular matrix complex subunit 1
100  IER3  chr6_ssto_hap7:2043292-2044644  104.82  52.52  −1.00  −1.40  5.00E−05  2.53E−03  immediate early response 3
101  RHOB  chr2:20646834-20649201  57.49  28.46  −1.01  −1.42  5.00E−05  2.53E−03  ras homolog family member B
102  CAP2  chr6:17393735-17558023  1.18  0.58  −1.02  −0.89  1.20E−03  3.40E−02  CAP, adenylate cyclase-associated protein, 2 (yeast)
103  P3H2  chr3:189674516-189840226  18.52  9.08  −1.03  −1.49  5.00E−05  2.53E−03  prolyl 3-hydroxylase 2
104  MTRNR2L1  chr17:22022436-22023991  157.37  76.54  −1.04  −1.44  5.00E−05  2.53E−03  MT-RNR2-like 1
105  MST1P2  chr1:16972068-16976915  3.54  1.68  −1.08  −1.03  6.00E−04  1.99E−02  macrophage stimulating 1 (hepatocyte growth factor-like) pseudogene 2
106  CELSR1  chr22:46756730-46933067  2.16  0.98  −1.13  −1.49  5.00E−05  2.53E−03  cadherin, EGF LAG seven-pass G-type receptor 1
107  LYPD6B  chr2:149894980-150071772  3.69  1.67  −1.14  −1.02  5.00E−05  2.53E−03  LY6/PLAUR domain containing 6B
108  ZNF334  chr20:45128268-45142198  1.02  0.44  −1.22  −0.94  9.50E−04  2.87E−02  zinc finger protein 334
109  FAR2P2  chr2:131174325-131186119  0.94  0.39  −1.28  −0.98  8.50E−04  2.66E−02  fatty acyl CoA reductase 2 pseudogene 2
110  SOX8  chr16:1031807-1036979  1.64  0.67  −1.29  −1.07  1.50E−03  4.05E−02  SRY (sex determining region Y)-box 8
111  ITLN1  chr1:160846329-160854960  592.60  242.61  −1.29  −1.36  5.00E−05  2.53E−03  intelectin 1 (galactofuranose binding)
112  RRAD  chr16:66955581-66959439  2.59  1.06  −1.29  −0.96  1.40E−03  3.88E−02  Ras-related associated with diabetes
113  VSTM2L  chr20:36531498-36573747  1.24  0.48  −1.36  −0.97  1.10E−03  3.22E−02  V-set and transmembrane domain containing 2 like
114  GP2  chr16:20321810-20338835  3.44  1.22  −1.50  −1.37  5.00E−05  2.53E−03  glycoprotein 2 (zymogen granule membrane)
115  PPP1R3G  chr6:5085719-5087455  3.67  1.28  −1.52  −1.27  1.50E−04  6.74E−03  protein phosphatase 1, regulatory subunit 3G
116  PEG10  chr7:94285636-94299006  1.38  0.47  −1.57  −1.39  5.00E−05  2.53E−03  paternally expressed 10
117  BEX1  chrX:102317580-102319168  2.45  0.81  −1.60  −0.95  1.60E−03  4.26E−02  brain expressed, X-linked 1
118  SLC18A1  chr8:20002365-20040717  2.63  0.86  −1.61  −1.43  5.00E−05  2.53E−03  solute carrier family 18 (vesicular monoamine transporter), member 1
119  KLK15  chr19:51328544-51334779  7.78  2.54  −1.62  −1.52  5.00E−05  2.53E−03  kallikrein-related peptidase 15
120  RFX6  chr6:117198375-117253326  0.93  0.26  −1.84  −1.38  5.00E−05  2.53E−03  regulatory factor X, 6
121  SCG3  chr15:51973549-52013223  1.07  0.29  −1.90  −1.26  5.00E−04  1.75E−02  secretogranin III
122  TTR  chr18:29171729-29178986  5.66  1.50  −1.92  −1.26  5.00E−05  2.53E−03  transthyretin
123  PYY2  chr17:26553588-26555085  1.76  0.46  −1.95  −1.05  1.85E−03  4.74E−02  peptide YY, 2 (pseudogene)
124  RUNDC3A  chr17:42385926-42396038  1.13  0.29  −1.97  −1.22  4.50E−04  1.62E−02  RUN domain containing 3A
125  HOXD9  chr2:176987412-176989645  10.70  2.60  −2.04  −2.05  5.00E−05  2.53E−03  homeobox D9
126  TMPRSS6  chr22:37461475-37505603  1.40  0.34  −2.05  −1.68  5.00E−05  2.53E−03  transmembrane protease, serine 6
127  NCCRP1  chr19:39687603-39692522  1.35  0.32  −2.10  −1.20  3.00E−04  1.18E−02  non-specific cytotoxic cell receptor protein 1 homolog (zebrafish)
128  CRYBA2  chr2:219854911-219858127  7.02  1.54  −2.19  −1.32  3.00E−04  1.18E−02  crystallin, beta A2
129  NKX2-2  chr20:21491659-21494664  0.96  0.21  −2.22  −1.30  3.50E−04  1.32E−02  NK2 homeobox 2
130  LOC441454  chr9:99671356-99672737  1.71  0.36  −2.24  −1.27  1.00E−04  4.83E−03  prothymosin, alpha pseudogene
131  HOXD10  chr2:176981491-176984670  10.02  1.72  −2.55  −2.52  5.00E−05  2.53E−03  homeobox D10
132  OR51E2  chr11:4701400-4719076  6.33  0.66  −3.26  −2.30  5.00E−05  2.53E−03  olfactory receptor, family 51, subfamily E, member 2
133  COL2A1  chr12:48366747-48398285  1.36  0.14  −3.32  −2.62  5.00E−05  2.53E−03  collagen, type II, alpha 1
134  ALDH1A2  chr15:58245621-58358121  1.13  0.10  −3.49  −1.81  1.00E−03  3.00E−02  aldehyde dehydrogenase 1 family, member A2
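As a reading aid for the tables, the following minimal Python sketch checks how the log₂FC column relates to the two mean-expression columns. It assumes, as the tabulated values suggest but the tables do not state, that log₂FC is the base-2 logarithm of the SSA/P mean divided by the comparison-group mean, and that "Inf" marks a comparison-group mean of zero. The function name `log2_fold_change` and the tolerance are illustrative choices, and the three rows are copied from Table 5.

```python
import math


def log2_fold_change(mean_ref: float, mean_ssap: float) -> float:
    """Return log2(mean_ssap / mean_ref) for non-negative group means.

    Assumption (not stated in the tables): a zero reference mean with a
    positive SSA/P mean is reported as +infinity, mirroring the "Inf" rows.
    """
    if mean_ref == 0 and mean_ssap == 0:
        return math.nan          # ratio undefined when both means are zero
    if mean_ref == 0:
        return math.inf          # e.g. the MIR4800 row (0.00 vs 37.38)
    if mean_ssap == 0:
        return -math.inf         # expression lost entirely in SSA/P
    return math.log2(mean_ssap / mean_ref)


# (gene, mean_HP, mean_SSA/P, tabulated log2FC) copied from Table 5.
TABLE5_ROWS = [
    ("ANXA10", 6.18, 51.85, 3.07),
    ("PSCA", 11.23, 31.23, 1.48),
    ("HOXD10", 10.02, 1.72, -2.55),
]

# The printed means are rounded to two decimals, so allow a small tolerance.
TOLERANCE = 0.02

for gene, mean_hp, mean_ssap, tabulated in TABLE5_ROWS:
    recomputed = log2_fold_change(mean_hp, mean_ssap)
    agrees = abs(recomputed - tabulated) <= TOLERANCE
    print(f"{gene}: recomputed {recomputed:.3f}, tabulated {tabulated:.2f}, agrees={agrees}")
```

Under this reading, a positive log₂FC marks a gene with higher mean expression in SSA/Ps and a negative value marks lower expression. Small differences between recomputed and tabulated values are expected because the means are printed to two decimals.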

TABLE 6 The list of 1058 genes exclusively DE between CR and SSA/Psamples. mean gene locus mean_CR SSA/P log₂FC test_stat p_value p_(adj)_(—) value Description 1 KLK6 chr19: 51461886- 0.00 2.31 Inf NA 5.00E−057.97E−04 kallikrein-related peptidase 6 51472929 2 CASC19 chr8:128200030- 0.00 4.02 Inf NA 5.00E−05 7.97E−04 cancer susceptibilitycandidate 19 128209872 (non-protein coding) 3 MIR4687 chr11: 3876932-0.00 12.04 Inf NA 2.50E−04 3.38E−03 microRNA 4687 4114440 4 MIR6727chr1: 1243993- 0.00 378.27 Inf NA 3.50E−04 4.55E−03 microRNA 67271260067 5 FAM25A chr10: 88780045- 0.06 6.54 6.69 1.06 1.00E−04 1.51E−03family with sequence similarity 25, 88784487 member A 6 HTR1D chr1:23518387- 0.13 10.29 6.28 4.15 5.00E−05 7.97E−04 5-hydroxytryptamine(serotonin) 23521222 receptor 1D, G protein-coupled 7 CDH3 chr16:68678150- 0.14 9.02 5.97 3.87 5.00E−05 7.97E−04 cadherin 3, type 1,P-cadherin 68732957 (placental) 8 PIWIL1 chr12: 130822432- 0.05 2.215.50 2.91 5.00E−05 7.97E−04 piwi-like RNA-mediated gene 130856877silencing 1 9 AFAP1-AS1 chr4: 7755816- 0.03 1.31 5.29 0.86 1.50E−042.18E−03 AFAP1 antisense RNA 1 7941653 10 PRSS22 chr16: 2902727- 0.6825.53 5.24 4.31 5.00E−05 7.97E−04 protease, serine, 22 2908171 11 EPHX4chr1: 92495532- 0.19 6.74 5.13 3.11 5.00E−05 7.97E−04 epoxide hydrolase4 92529093 12 KRT7 chr12: 52626953- 0.22 7.17 5.05 2.93 5.00E−057.97E−04 keratin 7, type II 52642709 13 KLHL30 chr2: 239047362- 0.051.60 4.97 2.58 5.00E−05 7.97E−04 kelch-like family member 30 23906154714 MYBPC1 chr12: 101988708- 0.11 3.16 4.90 3.30 5.00E−05 7.97E−04 myosinbinding protein C, slow 102079658 type 15 SAA1 chr11: 18287807- 0.288.21 4.85 1.84 6.00E−04 7.22E−03 serum amyloid A1 18291523 16 CXCL11chr4: 76932332- 0.17 4.70 4.79 1.59 5.75E−03 4.55E−02 chemokine (C-X-Cmotif) 77033955 ligand 11 17 SH3PXD2A- chr10: 105353783- 0.29 8.08 4.780.81 5.00E−05 7.97E−04 SH3PXD2A antisense RNA 1 AS1 105615164 18 SFTA2chr_6qbl_hap6: 0.07 1.79 4.75 0.95 5.00E−05 7.97E−04 surfactantassociated 2 2192097- 
2192923 19 DUSP4 chr8: 29190578- 0.40 9.96 4.624.75 5.00E−05 7.97E−04 dual specificity phosphatase 4 29208267 20 DUOXA1chr15: 45406522- 0.69 15.69 4.51 1.68 5.00E−05 7.97E−04 dual oxidasematuration factor 1 45422075 21 CRNDE chr16: 54952776- 0.19 3.75 4.321.74 3.00E−04 3.97E−03 colorectal neoplasia differentially 54963101expressed (non-protein coding) 22 SAA2 chr11: 18252901- 0.22 4.23 4.281.90 9.00E−04 1.01E−02 serum amyloid A2 18270221 23 CCAT1 chr8:128219626- 0.14 2.62 4.26 2.62 5.00E−05 7.97E−04 colon cancer associatedtranscript 128231513 1 (non-protein coding) 24 PDX1 chr13: 28403895-0.24 4.61 4.24 3.12 5.00E−05 7.97E−04 pancreatic and duodenal 28500451homeobox 1 25 SSTR5 chr16: 1114081- 0.09 1.64 4.13 1.45 6.00E−047.22E−03 somatostatin receptor 5 1131454 26 LINC00520 chr14: 56247852-0.56 8.76 3.97 3.32 5.00E−05 7.97E−04 long intergenic non-protein coding56263392 RNA520 27 KRT80 chr12: 52562779- 0.08 1.28 3.92 2.51 5.00E−057.97E−04 keratin 80, type II 52585784 28 UGT1A3 chr2: 234526290- 0.212.90 3.75 0.34 7.50E−04 8.64E−03 UDP glucuronosyltransferase 1 234681951family, polypeptide A3 29 F0SL1 chr11: 65659691- 0.18 2.46 3.74 2.345.00E−05 7.97E−04 FOS-like antigen 1 65667997 30 C11orf91 chr11:33719653- 0.21 2.65 3.64 1.81 3.50E−04 4.55E−03 chromosome 11 openreading 33722286 frame 91 31 MDFI chr6: 41606194- 0.59 7.08 3.59 3.135.00E−05 7.97E−04 MyoD family inhibitor 41621982 32 LMO7DN chr13:76445173- 0.15 1.72 3.54 1.84 1.00E−04 1.51E−03 LMO7 downstream neighbor76457948 33 KCTD14 chr11: 77726760- 0.30 3.52 3.53 0.59 5.90E−034.65E−02 potassium channel tetramerization 77850699 domain containing 1434 SSTR5-AS1 chr16: 1114081- 0.49 5.47 3.47 2.61 5.00E−05 7.97E−04 SSTR5antisense RNA 1 1131454 35 KCP chr7: 128516918- 0.32 3.31 3.37 3.115.00E−05 7.97E−04 kielin/chordin-like protein 128550773 36 CLDN14 chr21:37832919- 0.16 1.64 3.37 1.29 5.00E−05 7.97E−04 claudin 14 37948867 37DUOX2 chr15: 45384851- 11.84 118.72 3.33 1.76 5.00E−05 7.97E−04 dualoxidase 2 45406359 38 GJC2 
chr1: 228337414- 0.16 1.59 3.30 1.96 1.50E−042.18E−03 gap junction protein, gamma 2, 228347527 47 kDa 39 SLC6A20chr3: 45796940- 1.53 15.01 3.30 4.52 5.00E−05 7.97E−04 solute carrierfamily 6 (proline 45838035 IMINO transporter), member 20 40 C2CD4Achr15: 62359175- 0.24 2.24 3.24 2.41 5.00E−05 7.97E−04 C2calcium-dependent domain 62363116 containing 4A 41 MYEOV chr11:69061621- 2.61 24.67 3.24 3.75 5.00E−05 7.97E−04 myeloma overexpressed69064754 42 TFAP2A chr6: 10396915- 0.20 1.80 3.20 2.14 5.00E−05 7.97E−04transcription factor AP-2 alpha 10419797 (activating enhancer bindingprotein 2 alpha) 43 TNFSF9 chr19: 6531009- 0.21 1.75 3.08 1.67 5.00E−057.97E−04 tumor necrosis factor (ligand) 6535939 superfamily, member 9 44SEC14L2 chr22: 30792929- 0.44 3.72 3.07 2.89 5.00E−05 7.97E−04SEC14-like lipid binding 2 30821291 45 PLA2G2F chr1: 20465822- 0.16 1.242.98 1.45 5.00E−05 7.97E−04 phospholipase A2, group IIF 20476879 46C6orf223 chr6: 43968336- 0.97 7.57 2.97 2.90 5.00E−05 7.97E−04chromosome 6 open reading frame 43973694 223 47 IRX2 chr5: 2746278- 0.191.50 2.94 1.90 5.00E−05 7.97E−04 iroquois homeobox 2 2751769 48 SIM2chr21: 38071990- 0.27 2.06 2.92 2.28 5.00E−05 7.97E−04 single-mindedfamily bHLH 38122510 transcription factor 2 49 SLC4A11 chr20: 3208062-0.31 2.32 2.91 2.39 5.00E−05 7.97E−04 solute carrier family 4, sodium3219887 borate transporter, member 11 50 TTC9 chr14: 71108503- 0.62 4.642.91 3.31 5.00E−05 7.97E−04 tetratricopeptide repeat domain 9 7114207751 TMPRSS3 chr21: 43791995- 0.53 3.86 2.87 2.27 5.00E−05 7.97E−04transmembrane protease, serine 3 43816955 52 SLC16A4 chr1: 110905472-1.08 7.54 2.81 2.86 5.00E−05 7.97E−04 solute carrier family 16, member 4110933704 53 XKR9 chr8: 71581599- 0.64 4.40 2.79 2.88 5.00E−05 7.97E−04XK, Kell blood group complex 71648177 subunit-related family, member 954 IL1RN chr2: 113875469- 3.40 23.19 2.77 3.42 5.00E−05 7.97E−04interleukin 1 receptor antagonist 113891593 55 C2CD4B chr15: 62455736-0.53 3.62 2.76 2.03 5.00E−05 7.97E−04 C2 
calcium-dependent domain62457482 containing 4B 56 IL1B chr2: 113587336- 1.41 9.47 2.74 2.675.00E−05 7.97E−04 interleukin 1, beta 113594356 57 DSG3 chr18: 29027731-0.21 1.37 2.72 2.37 5.00E−05 7.97E−04 desmoglein 3 29058665 58 ANXA1chr9: 75766780- 25.34 161.24 2.67 4.39 5.00E−05 7.97E−04 annexin A175785307 59 MTCL1 chr18: 8717368- 0.48 2.95 2.63 2.94 5.00E−05 7.97E−04microtubule crosslinking factor 1 8832775 60 CXCL1 chr4: 74735108- 3.8123.37 2.62 3.02 5.00E−05 7.97E−04 chemokine (C-X-C motif) ligand 174737019 (melanoma growth stimulating activity, alpha) 61 CHAC1 chr15:41245635- 1.72 10.35 2.59 2.51 5.00E−05 7.97E−04 ChaCglutathione-specific gamma- 41248717 glutamylcyclotransferase 1 62 CXCL3chr4: 74902311- 2.22 13.30 2.58 2.52 5.00E−05 7.97E−04 chemokine (C-X-Cmotif) ligand 3 74904490 63 AHNAK2 chr14: 105403590- 0.28 1.70 2.58 3.205.00E−05 7.97E−04 AHNAK nucleoprotein 2 105444694 64 IL33 chr9: 6215785-1.69 10.04 2.57 3.21 5.00E−05 7.97E−04 interleukin 33 6257983 65 SYT12chr11: 66790189- 0.27 1.57 2.53 1.60 5.00E−05 7.97E−04 synaptotagmin XII66818334 66 IRAK2 chr3: 10206562- 1.07 6.19 2.53 2.86 5.00E−05 7.97E−04interleukin-1 receptor-associated 10285427 kinase 2 67 TM4SF1 chr3:149086804- 24.36 139.66 2.52 4.27 5.00E−05 7.97E−04 transmembrane 4 Lsix family 149104370 member 1 68 CDSN chr6_qbl_hap6: 0.68 3.84 2.50 1.414.00E−04 5.06E−03 corneodesmosin 2378425- 2403713 69 GRHL1 chr2:10091791- 0.24 1.34 2.45 1.92 5.00E−05 7.97E−04 grainyhead-liketranscription factor 10142412 1 70 PHLDA1 chr12: 76419226- 2.67 14.502.44 3.70 5.00E−05 7.97E−04 pleckstrin homology-like domain, 76425556family A, member 1 71 SERPINE2 chr2: 224839764- 8.44 43.34 2.36 3.565.00E−05 7.97E−04 serpin peptidase inhibitor, clade E 224904036 (nexin,plasminogen activator inhibitor type 1), member 2 72 SRD5A3 chr4:56212387- 2.95 14.76 2.32 3.49 5.00E−05 7.97E−04 steroid 5alpha-reductase 3 56251747 73 PLEKHN1 chr1: 901876- 0.47 2.37 2.32 1.945.00E−05 7.97E−04 pleckstrin homology domain 910484 containing, 
family Nmember 1 74 DUOX1 chr15: 45422191- 0.43 2.11 2.28 2.25 5.00E−05 7.97E−04dual oxidase 1 45457776 75 FADS2 chr11: 61567096- 2.55 12.27 2.26 2.135.00E−05 7.97E−04 fatty acid desaturase 2 61634826 76 PHLDA2 chr11:2949502- 15.13 72.46 2.26 2.94 5.00E−05 7.97E−04 pleckstrinhomology-like domain, 2950650 family A, member 2 77 RTN4R chr22:20228937- 1.47 6.95 2.24 2.32 5.00E−05 7.97E−04 reticulon 4 receptor20255816 78 WFDC21P chr17: 58160926- 1.09 5.00 2.20 1.49 5.00E−057.97E−04 WAP four-disulfide core domain 58165828 21, pseudogene 79TNFRSF11B chr8: 119935795- 1.96 8.83 2.17 2.72 5.00E−05 7.97E−04 tumornecrosis factor receptor 119964383 superfamily, member 11b 80 CLIC3chr9: 139889059- 2.05 9.10 2.15 1.72 5.00E−05 7.97E−04 chlorideintracellular channel 3 139891024 81 S100A2 chr1: 153533584- 0.55 2.432.14 1.27 4.00E−04 5.06E−03 S100 calcium binding protein A2 153538306 82NAT8 chr2: 73867849- 1.26 5.56 2.14 1.61 5.00E−05 7.97E−04N-acetyltransferase 8 (GCN5- 73869537 related, putative) 83 TRIM7 chr5:180620923- 3.12 13.61 2.13 2.07 5.00E−05 7.97E−04 tripartite motifcontaining 7 180632293 84 RGS2 chr1: 192778168- 10.92 47.56 2.12 3.335.00E−05 7.97E−04 regulator of G-protein signaling 2 192781407 85ZSCAN12P1 chr6: 28058584- 0.55 2.36 2.09 2.02 5.00E−05 7.97E−04 zincfinger and SCAN domain 28063493 containing 12 pseudogene 1 86 ELFN1-AS1chr7: 1748797- 0.39 1.64 2.08 0.92 5.15E−03 4.16E−02 ELFN1 antisense RNA1 1787590 87 TNFRSF12A chr16: 3070312- 7.12 29.77 2.06 2.52 5.00E−057.97E−04 tumor necrosis factor receptor 3072383 superfamily, member 12A88 MYPN chr10: 69865873- 0.38 1.60 2.06 1.99 5.00E−05 7.97E−04myopalladin 69971773 89 KIAA0895 chr7: 36363758- 0.66 2.70 2.04 1.286.00E−04 7.22E−03 KIAA0895 36493401 90 ENTPD3 chr3: 40428646- 0.60 2.442.03 1.68 5.00E−05 7.97E−04 ectonucleoside triphosphate 40494799diphosphohydrolase 3 91 ADAMTSL5 chr19: 1505016- 1.06 4.31 2.02 2.035.00E−05 7.97E−04 ADAMTS-like 5 1513188 92 STX1A chr7: 73113534- 1.244.95 2.00 1.94 5.00E−05 7.97E−04 
syntaxin 1A (brain) 73134017
93 ME1 chr6:83920109-84140938 7.99 31.89 2.00 3.33 5.00E−05 7.97E−04 malic enzyme 1, NADP(+)-dependent, cytosolic
94 ANXA3 chr4:79472741-79531605 18.26 72.68 1.99 3.37 5.00E−05 7.97E−04 annexin A3
95 TM4SF1-AS1 chr3:149086804-149104370 0.27 1.07 1.99 0.22 4.75E−03 3.89E−02 TM4SF1 antisense RNA 1
96 C10orf10 chr10:45455218-45490172 3.11 12.27 1.98 1.32 1.05E−03 1.15E−02 chromosome 10 open reading frame 10
97 LRRC8E chr19:7953389-7966908 0.60 2.37 1.97 1.94 5.00E−05 7.97E−04 leucine rich repeat containing 8 family, member E
98 CATSPERB chr14:92047117-92198413 0.86 3.37 1.97 2.20 5.00E−05 7.97E−04 catsper channel auxiliary subunit beta
99 CCNO chr5:54526980-54529508 1.46 5.71 1.96 1.57 5.00E−05 7.97E−04 cyclin O
100 MIR614 chr12:13068762-13068852 673.60 2602.29 1.95 1.42 5.00E−05 7.97E−04 microRNA 614
101 C17orf67 chr17:54869273-54911256 0.98 3.74 1.94 1.79 5.00E−05 7.97E−04 chromosome 17 open reading frame 67
102 ETV4 chr17:41605210-41623800 2.02 7.73 1.93 2.11 5.00E−05 7.97E−04 ets variant 4
103 CAPN11 chr6:44126547-44152139 0.65 2.47 1.92 1.54 5.00E−05 7.97E−04 calpain 11
104 SMOX chr20:4129425-4168394 2.21 8.16 1.89 2.12 5.00E−05 7.97E−04 spermine oxidase
105 ARX chrX:25021812-25034065 0.34 1.26 1.87 1.40 1.50E−04 2.18E−03 aristaless related homeobox
106 TSPAN5 chr4:99391517-99579812 1.40 5.06 1.85 2.24 5.00E−05 7.97E−04 tetraspanin 5
107 TIMP1 chrX:47420498-47479256 61.09 216.38 1.82 1.94 5.00E−05 7.97E−04 TIMP metallopeptidase inhibitor 1
108 ARNTL2 chr12:27485786-27599567 0.85 3.00 1.82 2.44 5.00E−05 7.97E−04 aryl hydrocarbon receptor nuclear translocator-like 2
109 PARP8 chr5:49961732-50142356 1.43 5.02 1.81 2.66 5.00E−05 7.97E−04 poly (ADP-ribose) polymerase family, member 8
110 OTUB2 chr14:94492723-94515276 0.65 2.27 1.81 1.87 5.00E−05 7.97E−04 OTU deubiquitinase, ubiquitin aldehyde binding 2
111 MPP3 chr17:41878166-41910547 0.53 1.86 1.81 1.72 5.00E−05 7.97E−04 membrane protein, palmitoylated 3 (MAGUK p55 subfamily
member 3)
112 EIF5A2 chr3:170606203-170626426 0.33 1.14 1.78 1.82 5.00E−05 7.97E−04 eukaryotic translation initiation factor 5A2
113 RHOD chr11:66824288-66839488 8.98 30.71 1.77 2.30 5.00E−05 7.97E−04 ras homolog family member D
114 ASPHD2 chr22:26825279-26840978 1.82 6.19 1.77 2.16 5.00E−05 7.97E−04 aspartate beta-hydroxylase domain containing 2
115 STRIP2 chr7:129074273-129128239 0.80 2.73 1.77 1.87 5.00E−05 7.97E−04 striatin interacting protein 2
116 INSC chr11:15133969-15268756 1.32 4.50 1.77 1.90 5.00E−05 7.97E−04 inscuteable homolog (Drosophila)
117 SPTBN2 chr11:66452719-66488870 0.65 2.21 1.76 2.04 5.00E−05 7.97E−04 spectrin, beta, non-erythrocytic 2
118 PF4 chr4:74846541-74847841 0.42 1.40 1.75 1.00 3.05E−03 2.72E−02 platelet factor 4
119 ALPPL2 chr2:233271551-233275424 0.40 1.32 1.74 1.26 5.00E−05 7.97E−04 alkaline phosphatase, placental-like 2
120 CD82 chr11:44587140-44641315 18.88 62.74 1.73 2.81 5.00E−05 7.97E−04 CD82 molecule
121 HAPLN4 chr19:19366451-19373596 0.53 1.75 1.72 1.53 5.00E−05 7.97E−04 hyaluronan and proteoglycan link protein 4
122 LOC100130705 chr7:128506463-128512101 0.78 2.56 1.72 1.65 5.00E−05 7.97E−04 uncharacterized LOC100130705
123 ARFGAP3 chr22:43192531-43253408 13.97 45.68 1.71 2.97 5.00E−05 7.97E−04 ADP-ribosylation factor GTPase activating protein 3
124 LDLR chr19:11200037-11244505 11.17 36.20 1.70 2.74 5.00E−05 7.97E−04 low density lipoprotein receptor
125 UNC13D chr17:73823307- 9.64 30.97 1.68 2.84 5.00E−05 7.97E−04 unc-13 homolog D (C.
elegans) 73840798
126 GPRIN1 chr5:176022802-176037131 2.98 9.37 1.66 2.31 5.00E−05 7.97E−04 G protein regulated inducer of neurite outgrowth 1
127 LAMC2 chr1:183155173-183214262 15.08 47.41 1.65 2.58 5.00E−05 7.97E−04 laminin, gamma 2
128 TRIB3 chr20:361307-378203 2.05 6.39 1.64 1.83 5.00E−05 7.97E−04 tribbles pseudokinase 3
129 SLC5A9 chr1:48688356-48714316 0.97 3.02 1.64 1.78 5.00E−05 7.97E−04 solute carrier family 5 (sodium/sugar cotransporter), member 9
130 ITGA2 chr5:52285155-52390609 4.93 15.24 1.63 2.75 5.00E−05 7.97E−04 integrin, alpha 2 (CD49B, alpha 2 subunit of VLA-2 receptor)
131 SESTD1 chr2:179966418-180129350 1.58 4.88 1.62 2.74 5.00E−05 7.97E−04 SEC14 and spectrin domains 1
132 SIGLEC12 chr19:51994480-52005043 0.80 2.42 1.60 1.38 5.00E−05 7.97E−04 sialic acid binding Ig-like lectin 12 (gene/pseudogene)
133 HSPA4L chr4:128703452-128754526 0.67 2.02 1.60 1.64 5.00E−05 7.97E−04 heat shock 70 kDa protein 4-like
134 BTNL9 chr5:180467224-180488523 1.12 3.36 1.59 1.69 5.00E−05 7.97E−04 butyrophilin-like 9
135 BACE2 chr21:42539727-42654461 8.52 25.38 1.58 2.73 5.00E−05 7.97E−04 beta-site APP-cleaving enzyme 2
136 PRDM8 chr4:81106423-81125482 0.71 2.09 1.56 1.40 5.00E−05 7.97E−04 PR domain containing 8
137 TUBB2A chr6:3153901-3157783 6.96 20.52 1.56 2.16 5.00E−05 7.97E−04 tubulin, beta 2A class IIa
138 ZG16B chr16:2880172-2882285 25.29 73.85 1.55 2.33 5.00E−05 7.97E−04 zymogen granule protein 16B
139 CELSR3 chr3:48673895-48700348 0.49 1.42 1.53 1.83 5.00E−05 7.97E−04 cadherin, EGF LAG seven-pass G-type receptor 3
140 MET chr7:116312458-116438440 6.21 17.84 1.52 2.69 5.00E−05 7.97E−04 MET proto-oncogene, receptor tyrosine kinase
141 TMEM163 chr2:135213329-135476571 0.48 1.37 1.51 1.03 4.00E−04 5.06E−03 transmembrane protein 163
142 SLC28A3 chr9:86890764-86983413 0.38 1.08 1.49 1.38 5.00E−05 7.97E−04 solute carrier family 28 (concentrative nucleoside transporter), member 3
143 LPL chr8:19796581- 0.38 1.05 1.48 1.30 5.00E−05 7.97E−04 lipoprotein lipase
19824770
144 TRIM16 chr17:15531279-15586193 4.49 12.55 1.48 2.22 5.00E−05 7.97E−04 tripartite motif containing 16
145 TPK1 chr7:144149033-144533146 4.47 12.38 1.47 2.13 5.00E−05 7.97E−04 thiamin pyrophosphokinase 1
146 ADM2 chr22:50919984-50924866 1.61 4.46 1.47 1.77 5.00E−05 7.97E−04 adrenomedullin 2
147 C8G chr9:139839697-139841426 3.94 10.79 1.45 1.35 5.00E−05 7.97E−04 complement component 8, gamma polypeptide
148 S100A11 chr1:152004981-152009511 295.16 806.03 1.45 2.44 5.00E−05 7.97E−04 S100 calcium binding protein A11
149 RAPGEF3 chr12:48128452-48152889 1.60 4.34 1.44 1.96 5.00E−05 7.97E−04 Rap guanine nucleotide exchange factor (GEF) 3
150 TM4SF5 chr17:4675186-4686506 21.66 58.69 1.44 1.91 5.00E−05 7.97E−04 transmembrane 4 L six family member 5
151 BMP7 chr20:55743808-55841707 0.87 2.36 1.44 1.53 5.00E−05 7.97E−04 bone morphogenetic protein 7
152 SYT8 chr11:1855539-1858751 0.80 2.15 1.43 1.08 5.50E−04 6.69E−03 synaptotagmin VIII
153 SSUH2 chr3:8661085-8693764 3.55 9.57 1.43 1.70 5.00E−05 7.97E−04 ssu-2 homolog (C. elegans)
154 DPP4 chr2:162848754-162931052 12.22 32.97 1.43 2.43 5.00E−05 7.97E−04 dipeptidyl-peptidase 4
155 FAM83H-AS1 chr8:144816309-144828507 2.07 5.56 1.43 1.71 5.00E−05 7.97E−04 FAM83H antisense RNA 1 (head to head)
156 LMO7 chr13:76123615-76434006 24.92 66.89 1.42 2.19 5.00E−05 7.97E−04 LIM domain 7
157 SLC7A4 chr22:21383006-21386847 1.46 3.88 1.41 1.29 5.00E−05 7.97E−04 solute carrier family 7, member 4
158 MLPH chr2:238395052-238463961 20.32 53.64 1.40 2.42 5.00E−05 7.97E−04 melanophilin
159 ERRFI1 chr1:8071778-8086393 12.88 34.00 1.40 2.39 5.00E−05 7.97E−04 ERBB receptor feedback inhibitor 1
160 ARHGAP29 chr1:94634462-94703307 0.55 1.44 1.40 1.76 5.00E−05 7.97E−04 Rho GTPase activating protein 29
161 MUC3A chr7:100547051-100611619 25.38 66.35 1.39 2.26 5.00E−05 7.97E−04 mucin 3A, cell surface associated
162 ISG15 chr1:948846-949919 20.73 53.92 1.38 1.84 5.00E−05 7.97E−04 ISG15 ubiquitin-like modifier
163 PMAIP1 chr18:57567191-
57571538 1.70 4.39 1.37 1.50 5.00E−05 7.97E−04 phorbol-12-myristate-13-acetate-induced protein 1
164 ZMYND15 chr17:4643309-4649414 1.08 2.80 1.37 1.29 5.00E−05 7.97E−04 zinc finger, MYND-type containing 15
165 PPL chr16:4932507-4987136 1.85 4.77 1.37 1.98 5.00E−05 7.97E−04 periplakin
166 LPGAT1 chr1:211916798-212004114 8.03 20.49 1.35 2.52 5.00E−05 7.97E−04 lysophosphatidylglycerol acyltransferase 1
167 AGR2 chr7:16832263-16844738 1120.97 2856.30 1.35 1.43 5.00E−05 7.97E−04 anterior gradient 2, protein disulphide isomerase family member
168 GATSL2 chr7:74601103-74988276 0.61 1.54 1.34 1.53 5.00E−05 7.97E−04 GATS protein-like 2
169 CDKN2A chr9:21967137-21994490 0.56 1.43 1.34 0.76 3.45E−03 3.01E−02 cyclin-dependent kinase inhibitor 2A
170 ZC3H12A chr1:37940118-37949978 8.19 20.58 1.33 2.08 5.00E−05 7.97E−04 zinc finger CCCH-type containing 12A
171 ALDH3B1 chr11:67776016-67796749 9.36 23.45 1.33 2.17 5.00E−05 7.97E−04 aldehyde dehydrogenase 3 family, member B1
172 NDUFC2-KCTD14 chr11:77726760-77850699 9.47 23.70 1.32 1.11 2.50E−04 3.38E−03 NDUFC2-KCTD14 readthrough
173 GIPR chr19:46171501-46185717 5.98 14.87 1.31 1.75 5.00E−05 7.97E−04 gastric inhibitory polypeptide receptor
174 ZNF165 chr6:28048481-28057340 1.69 4.20 1.31 1.51 5.00E−05 7.97E−04 zinc finger protein 165
175 GMPR chr6:16238810-16295780 2.15 5.32 1.31 1.37 5.00E−05 7.97E−04 guanosine monophosphate reductase
176 RTN2 chr19:45988545-46000313 3.17 7.83 1.30 1.18 5.00E−05 7.97E−04 reticulon 2
177 TYRO3 chr15:41851219-41871536 0.55 1.36 1.30 1.23 2.00E−04 2.80E−03 TYRO3 protein tyrosine kinase
178 VSTM5 chr11:93553734-93583668 1.51 3.72 1.30 0.93 1.90E−03 1.87E−02 V-set and transmembrane domain containing 5
179 KYNU chr2:143635194-143799885 1.31 3.20 1.29 1.26 5.00E−05 7.97E−04 kynureninase
180 SERPINB8 chr18:61637262-2-61656608 4.71 11.46 1.28 1.87 5.00E−05 7.97E−04 serpin peptidase inhibitor, clade B (ovalbumin), member 8
181 RSAD2 chr2:7017795- 6.48 15.78 1.28 1.82 5.00E−05 7.97E−04 radical
S-adenosylmethionine 7038363 domain containing 2
182 PLAUR chr19:44150246-44174498 22.04 53.41 1.28 2.05 5.00E−05 7.97E−04 plasminogen activator, urokinase receptor
183 TM6SF2 chr19:19375173-19384074 10.03 24.31 1.28 1.73 5.00E−05 7.97E−04 transmembrane 6 superfamily member 2
184 MMP12 chr11:102733463-102745764 15.23 36.65 1.27 1.87 5.00E−05 7.97E−04 matrix metallopeptidase 12
185 VSIG10L chr19:51834794-51845378 1.04 2.49 1.26 1.29 5.00E−05 7.97E−04 V-set and immunoglobulin domain containing 10 like
186 CXCL16 chr17:4634722-4643223 17.88 42.76 1.26 1.88 5.00E−05 7.97E−04 chemokine (C-X-C motif) ligand 16
187 ABHD2 chr15:89631380-89745591 24.14 57.42 1.25 1.94 5.00E−05 7.97E−04 abhydrolase domain containing 2
188 MAPK15 chr8:144798506-144804633 0.60 1.42 1.25 0.99 1.95E−03 1.91E−02 mitogen-activated protein kinase 15
189 FGF2 chr4:123747862-123844159 0.49 1.17 1.24 0.97 2.40E−03 2.25E−02 fibroblast growth factor 2 (basic)
190 ADAM9 chr8:38854504-38962779 29.81 70.07 1.23 2.16 5.00E−05 7.97E−04 ADAM metallopeptidase domain 9
191 CYP2S1 chr19:41699114-41713444 26.65 62.36 1.23 1.98 5.00E−05 7.97E−04 cytochrome P450, family 2, subfamily S, polypeptide 1
192 UST chr6:149068270-149398126 0.69 1.60 1.22 1.24 5.00E−05 7.97E−04 uronyl-2-sulfotransferase
193 AP1S3 chr2:224620046-224702319 2.04 4.74 1.22 1.62 5.00E−05 7.97E−04 adaptor-related protein complex 1, sigma 3 subunit
194 PITPNC1 chr17:65373396-65693379 1.64 3.81 1.22 1.74 5.00E−05 7.97E−04 phosphatidylinositol transfer protein, cytoplasmic 1
195 ST3GAL1 chr8:134467090-134584183 2.45 5.62 1.20 1.44 5.00E−05 7.97E−04 ST3 beta-galactoside alpha-2,3-sialyltransferase 1
196 GNA15 chr19:3136029-3163767 1.96 4.48 1.19 1.32 5.00E−05 7.97E−04 guanine nucleotide binding protein (G protein), alpha 15 (Gq class)
197 PVRL4 chr1:161040780-161059385 1.50 3.42 1.19 1.29 5.00E−05 7.97E−04 poliovirus receptor-related 4
198 TSTA3 chr8:144694787- 49.13 111.80 1.19 2.14 5.00E−05 7.97E−04 tissue specific transplantation
144699732 antigen P35B
199 SCD chr10:102106771-102124588 23.98 54.51 1.18 1.65 5.00E−05 7.97E−04 stearoyl-CoA desaturase (delta-9-desaturase)
200 HTATSF1P2 chr6:3020389-3025005 0.63 1.43 1.18 1.22 5.00E−05 7.97E−04 HIV-1 Tat specific factor 1 pseudogene 2
201 AJUBA chr14:23440382-234518516- 0.81 1.83 1.18 1.15 5.00E−05 7.97E−04 ajuba LIM protein
202 PHLDA3 chr1:201434606-201438299 4.80 10.83 1.17 1.48 5.00E−05 7.97E−04 pleckstrin homology-like domain, family A, member 3
203 C1orf116 chr1:207191865-207206101 10.04 22.65 1.17 2.07 5.00E−05 7.97E−04 chromosome 1 open reading frame 116
204 C6orf141 chr6:49518112-49519808 3.21 7.24 1.17 1.26 5.00E−05 7.97E−04 chromosome 6 open reading frame 141
205 TMC7 chr16:18995255-19091417 3.65 8.19 1.16 1.33 5.00E−05 7.97E−04 transmembrane channel-like 7
206 KIAA1549 chr7:138516126-138666064 0.42 0.95 1.16 1.39 5.00E−05 7.97E−04 KIAA1549
207 DAPP1 chr4:100737980-100791346 2.00 4.48 1.16 1.49 5.00E−05 7.97E−04 dual adaptor of phosphotyrosine and 3-phosphoinositides
208 ZNF432 chr19:52536676-52552073 2.29 5.12 1.16 1.49 5.00E−05 7.97E−04 zinc finger protein 432
209 DUSP5 chr10:112257624-112271302 14.15 31.63 1.16 1.87 5.00E−05 7.97E−04 dual specificity phosphatase 5
210 ISG20 chr15:89182038-89198879 32.62 72.51 1.15 1.94 5.00E−05 7.97E−04 interferon stimulated exonuclease gene 20 kDa
211 TPM4 chr19:16178316-16213813 129.57 287.54 1.15 1.72 5.00E−05 7.97E−04 tropomyosin 4
212 PTPN13 chr4:87515467-87736328 0.60 1.32 1.15 1.35 5.00E−05 7.97E−04 protein tyrosine phosphatase, non-receptor type 13 (APO-1/CD95 (Fas)-associated phosphatase)
213 AMOTL2 chr3:134074186-134094259 1.90 4.20 1.15 1.52 5.00E−05 7.97E−04 angiomotin like 2
214 PLK3 chr1:45266035-45272957 3.02 6.68 1.14 1.38 5.00E−05 7.97E−04 polo-like kinase 3
215 ADCY4 chr14:24787554-24804277 2.88 6.34 1.14 1.48 5.00E−05 7.97E−04 adenylate cyclase 4
216 TNS4 chr17:38632079-38657854 3.51 7.71 1.14 1.67 5.00E−05 7.97E−04 tensin 4
217 CITED4 chr1:41326727- 1.39 3.06 1.14
0.91 4.10E−03 3.46E−02 Cbp/p300-interacting 41328018 transactivator, with Glu/Asp-rich carboxy-terminal domain, 4
218 GLRA4 chrX:102962271-102983552 0.60 1.31 1.13 0.82 4.10E−03 3.46E−02 glycine receptor, alpha 4
219 ASPH chr8:62200524-62627199 31.72 69.58 1.13 1.92 5.00E−05 7.97E−04 aspartate beta-hydroxylase
220 MSMO1 chr4:166248817-166264314 39.79 87.24 1.13 1.81 5.00E−05 7.97E−04 methylsterol monooxygenase 1
221 DDIT4 chr10:74033676-74035797 23.50 51.22 1.12 1.88 5.00E−05 7.97E−04 DNA-damage-inducible transcript 4
222 PAM chr5:102201526-102366808 8.73 18.99 1.12 2.01 5.00E−05 7.97E−04 peptidylglycine alpha-amidating monooxygenase
223 SHF chr15:45459411-45493373 1.29 2.80 1.12 1.14 5.00E−05 7.97E−04 Src homology 2 domain containing F
224 ARHGEF3 chr3:567614451-57113336 2.38 5.17 1.12 1.51 5.00E−05 7.97E−04 Rho guanine nucleotide exchange factor (GEF) 3
225 LPCAT1 chr5:1461541-1524076 7.26 15.73 1.11 1.77 5.00E−05 7.97E−04 lysophosphatidylcholine acyltransferase 1
226 ACY3 chr11:67410025-67418130 8.03 17.34 1.11 1.45 5.00E−05 7.97E−04 aminoacylase 3
227 PLXNA3 chrX:153686620-153701989 8.18 17.63 1.11 1.92 5.00E−05 7.97E−04 plexin A3
228 SEC24D chr4:119643977-119757326 14.58 31.42 1.11 1.95 5.00E−05 7.97E−04 SEC24 homolog D, COPII coat complex component
229 AKR1B15 chr7:134233848-134264592 2.91 6.27 1.10 1.18 5.00E−05 7.97E−04 aldo-keto reductase family 1, member B15
230 SLC22A15 chr1:116519118-116612675 0.64 1.38 1.10 1.13 5.00E−05 7.97E−04 solute carrier family 22, member 15
231 MACC1 chr7:19958603-20257013 2.20 4.72 1.10 1.80 5.00E−05 7.97E−04 metastasis associated in colon cancer 1
232 FAM86DP chr3:75470702-75484266 1.68 3.60 1.10 1.23 5.00E−05 7.97E−04 family with sequence similarity 86, member D, pseudogene
233 DAPK1 chr9:90112142-90323549 5.72 12.28 1.10 1.86 5.00E−05 7.97E−04 death-associated protein kinase 1
234 TSPAN8 chr12:71518876-71551779 1477.14 3166.40 1.10 1.17 7.00E−04 8.19E−03 tetraspanin 8
235 EPS8L1 chr19:55587220- 18.44 39.39 1.09
1.89 5.00E−05 7.97E−04 EPS8-like 1 55599291
236 IL1R2 chr2:102608305-102644884 38.23 81.56 1.09 1.83 5.00E−05 7.97E−04 interleukin 1 receptor, type II
237 AGRN chr1:955502-991499 14.45 30.82 1.09 1.96 5.00E−05 7.97E−04 agrin
238 ACY1 chr3:52009041-52023218 54.72 116.70 1.09 1.89 5.00E−05 7.97E−04 aminoacylase 1
239 ATP13A2 chr1:17312452-17338467 17.04 36.30 1.09 1.94 5.00E−05 7.97E−04 ATPase type 13A2
240 SFXN3 chr10:102790995-102800998 7.52 16.00 1.09 1.75 5.00E−05 7.97E−04 sideroflexin 3
241 SGMS2 chr4:108745720-108836204 8.11 17.18 1.08 2.01 5.00E−05 7.97E−04 sphingomyelin synthase 2
242 KLK13 chr19:51559462-51568367 0.77 1.64 1.08 0.79 4.00E−03 3.39E−02 kallikrein-related peptidase 13
243 KPNA7 chr7:98771196-98805089 0.58 1.23 1.08 0.79 5.45E−03 4.37E−02 karyopherin alpha 7 (importin alpha 8)
244 CHPF chr2:220403668-220408487 15.25 32.27 1.08 1.95 5.00E−05 7.97E−04 chondroitin polymerizing factor
245 CCR6 chr6:167525294-167552629 1.61 3.41 1.08 1.26 5.00E−05 7.97E−04 chemokine (C-C motif) receptor 6
246 VWA1 chr1:1370902-1378262 5.71 12.01 1.07 1.78 5.00E−05 7.97E−04 von Willebrand factor A domain containing 1
247 MEIS2 chr15:37183221-37393500 0.75 1.57 1.06 0.98 4.00E−04 5.06E−03 Meis homeobox 2
248 USP18 chr22:18632757-18660162 3.62 7.53 1.06 1.35 5.00E−05 7.97E−04 ubiquitin specific peptidase 18
249 LIMCH1 chr4:41361623-41702061 1.51 3.13 1.05 1.42 5.00E−05 7.97E−04 LIM and calponin homology domains 1
250 PERP chr6:138409641-138428660 47.52 97.23 1.03 1.83 5.00E−05 7.97E−04 PERP, TP53 apoptosis effector
251 GPX2 chr14:65381078-65569413 168.48 344.46 1.03 1.18 5.00E−05 7.97E−04 glutathione peroxidase 2
252 TIMP4 chr3:12045833-12233532 1.27 2.60 1.03 0.89 1.95E−03 1.91E−02 TIMP metallopeptidase inhibitor 4
253 DUSP14 chr17:35849950-35873588 1.53 3.13 1.03 0.93 1.65E−03 1.66E−02 dual specificity phosphatase 14
254 TMEM150B chr19:55824168-55836708 25.21 51.40 1.03 1.52 5.00E−05 7.97E−04 transmembrane protein 150B
255 C8orf4 chr8:40010986- 15.29 31.16
1.03 1.74 5.00E−05 7.97E−04 chromosome 8 open reading frame 40012827 4
256 DHRS9 chr2:169923544-169952677 148.32 302.30 1.03 1.67 5.00E−05 7.97E−04 dehydrogenase/reductase (SDR family) member 9
257 LAMA3 chr18:21269561-21535029 16.87 34.36 1.03 1.75 5.00E−05 7.97E−04 laminin, alpha 3
258 TRIM47 chr17:73870244-73874656 12.77 25.99 1.03 1.70 5.00E−05 7.97E−04 tripartite motif containing 47
259 IFIT1 chr10:91152302-91166244 1.69 3.42 1.02 1.32 5.00E−05 7.97E−04 interferon-induced protein with tetratricopeptide repeats 1
260 PGM3 chr6:83777384-83906256 10.48 21.26 1.02 1.33 5.00E−05 7.97E−04 phosphoglucomutase 3
261 NABP1 chr2:192542797-192553248 7.63 15.43 1.02 1.75 5.00E−05 7.97E−04 nucleic acid binding protein 1
262 FCHO1 chr19:17858526-17899377 2.21 4.46 1.01 1.26 5.00E−05 7.97E−04 FCH domain only 1
263 PNMA1 chr14:74178485-74181128 10.79 21.73 1.01 1.63 5.00E−05 7.97E−04 paraneoplastic Ma antigen 1
264 PPP1R13L chr19:45882891-45927177 6.68 13.43 1.01 0.96 2.05E−03 1.98E−02 protein phosphatase 1, regulatory subunit 13 like
265 INPP1 chr2:191208195-191236391 15.81 31.68 1.00 1.71 5.00E−05 7.97E−04 inositol polyphosphate-1-phosphatase
266 IL1RL2 chr2:102803432-102855811 0.50 1.00 1.00 0.80 2.25E−03 2.14E−02 interleukin 1 receptor-like 2
267 GBP3 chr1:89472359-89488549 32.08 64.03 1.00 1.71 5.00E−05 7.97E−04 guanylate binding protein 3
268 BAG3 chr10:121410881-121437329 8.77 17.47 0.99 1.61 5.00E−05 7.97E−04 BCL2-associated athanogene 3
269 SLC2A1 chr1:43391045-43449029 20.36 40.51 0.99 1.71 5.00E−05 7.97E−04 solute carrier family 2 (facilitated glucose transporter), member 1
270 WWC1 chr5:167719064-167899308 4.39 8.69 0.98 1.65 5.00E−05 7.97E−04 WW and C2 domain containing 1
271 MAPK8IP1 chr11:45907046-45928016 0.72 1.43 0.98 0.92 2.30E−03 2.18E−02 mitogen-activated protein kinase 8 interacting protein 1
272 NOCT chr4:139936912-139967093 1.80 3.56 0.98 1.01 2.50E−04 3.38E−03 nocturnin
273 BHLHE40 chr3:5021096- 14.64 28.84 0.98 1.62 5.00E−05 7.97E−04 basic
helix-loop-helix family, 5026865 member e40
274 LINC01138 chr1:143717587-143744519 1.98 3.90 0.98 0.95 9.00E−04 1.01E−02 long intergenic non-protein coding RNA 1138
275 RHOF chr12:122150657-122231594 38.24 75.22 0.98 1.57 5.00E−05 7.97E−04 ras homolog family member F (in filopodia)
276 FAM214B chr9:35104117-35115893 14.72 28.95 0.98 1.69 5.00E−05 7.97E−04 family with sequence similarity 214, member B
277 SPRED1 chr15:38545051-38649450 7.77 15.29 0.98 1.77 5.00E−05 7.97E−04 sprouty-related, EVH1 domain containing 1
278 NUCB2 chr11:17298285-17353070 21.28 41.76 0.97 1.80 5.00E−05 7.97E−04 nucleobindin 2
279 MST1 chr3:49721379-49726196 1.97 3.86 0.97 1.06 2.00E−04 2.80E−03 macrophage stimulating 1
280 ANTXR2 chr4:80822770-80994626 16.05 31.45 0.97 1.35 5.00E−05 7.97E−04 anthrax toxin receptor 2
281 CD59 chr11:33724555-33758025 38.45 75.33 0.97 1.58 5.00E−05 7.97E−04 CD59 molecule, complement regulatory protein
282 ITGB4 chr17:73717515-73753899 45.05 88.26 0.97 1.64 5.00E−05 7.97E−04 integrin, beta 4
283 CYP4F11 chr19:16023179-16045676 2.99 5.86 0.97 1.19 1.50E−04 2.18E−03 cytochrome P450, family 4, subfamily F, polypeptide 11
284 LINC01207 chr4:165675282-165724947 9.06 17.62 0.96 1.63 5.00E−05 7.97E−04 long intergenic non-protein coding RNA 1207
285 CCNG2 chr4:78078356-78091213 16.17 31.43 0.96 1.78 5.00E−05 7.97E−04 cyclin G2
286 PODN chr1:53527723-53551174 1.61 3.08 0.94 1.06 5.50E−04 6.69E−03 podocan
287 BAIAP2L1 chr7:97910978-98030427 34.22 65.52 0.94 1.32 5.00E−05 7.97E−04 BAI1-associated protein 2-like 1
288 ITGA6 chr2:173292313-173371181 49.72 95.14 0.94 1.51 5.00E−05 7.97E−04 integrin, alpha 6
289 FLJ32255 chr5:42985500-42993435 1.55 2.95 0.94 0.99 6.50E−04 7.69E−03 uncharacterized LOC643977
290 ETV7 chr6:36321997-36355577 2.55 4.88 0.93 0.99 1.20E−03 1.28E−02 ets variant 7
291 LYZ chr12:69742133-69748013 249.84 477.20 0.93 1.42 5.00E−05 7.97E−04 lysozyme
292 RIPK3 chr14:24805226- 24.96 47.65 0.93 1.63 5.00E−05 7.97E−04 receptor-interacting serine-
24809242 threonine kinase 3
293 RHBDF1 chr16:108057-122629 13.85 26.42 0.93 1.60 5.00E−05 7.97E−04 rhomboid 5 homolog 1 (Drosophila)
294 OPTN chr10:13142081-13180276 28.46 54.27 0.93 1.71 5.00E−05 7.97E−04 optineurin
295 ZNF200 chr16:3272324-3285457 2.58 4.91 0.93 1.20 5.00E−05 7.97E−04 zinc finger protein 200
296 LRRC8A chr9:131644390-131680317 11.56 21.99 0.93 1.62 5.00E−05 7.97E−04 leucine rich repeat containing 8 family, member A
297 CNN2 chr19:1026297-1039064 22.16 42.14 0.93 1.62 5.00E−05 7.97E−04 calponin 2
298 NOS3 chr7:150688143-150721586 0.90 1.71 0.93 0.73 5.75E−03 4.55E−02 nitric oxide synthase 3 (endothelial cell)
299 TNFRSF21 chr6:47199262-47277683 35.07 66.68 0.93 1.62 5.00E−05 7.97E−04 tumor necrosis factor receptor superfamily, member 21
300 PODXL chr7:131185020-131241376 3.84 7.30 0.93 1.49 5.00E−05 7.97E−04 podocalyxin-like
301 TIFAB chr5:134784557-134788089 1.16 2.20 0.92 0.75 4.40E−03 3.65E−02 TRAF-interacting protein with forkhead-associated domain, family member B
302 TNFRSF10B chr8:22844929-22941132 13.00 24.63 0.92 1.43 5.00E−05 7.97E−04 tumor necrosis factor receptor superfamily, member 10b
303 SLC25A29 chr14:100757447-100772884 7.64 14.47 0.92 1.38 5.00E−05 7.97E−04 solute carrier family 25 (mitochondrial carnitine/acylcarnitine carrier), member 29
304 PHKA1 chrX:71798663-71934029 1.33 2.51 0.92 1.18 5.00E−05 7.97E−04 phosphorylase kinase, alpha 1 (muscle)
305 ARHGAP10 chr4:148653452-148993927 1.18 2.22 0.92 1.02 2.50E−04 3.38E−03 Rho GTPase activating protein 10
306 PMEPA1 chr20:56223447-56286592 5.99 11.32 0.92 1.48 5.00E−05 7.97E−04 prostate transmembrane protein, androgen induced 1
307 HAPLN3 chr15:89420518-89438770 1.79 3.37 0.92 0.83 3.30E−03 2.90E−02 hyaluronan and proteoglycan link protein 3
308 KRT19 chr17:39679868-39684641 875.64 1652.07 0.92 1.06 1.50E−04 2.18E−03 keratin 19, type I
309 PPP4R1L chr20:56807832-56884495 0.90 1.69 0.92 0.82 4.45E−03 3.68E−02 protein phosphatase 4, regulatory subunit 1-like (pseudogene)
310 FHL2 chr2:105977282-106055230 59.88 112.80 0.91 1.64 5.00E−05 7.97E−04 four and a half LIM domains 2
311 GALE chr1:24122088-24127294 72.79 136.72 0.91 1.64 5.00E−05 7.97E−04 UDP-galactose-4-epimerase
312 BCL2L1 chr20:30252260-30310656 33.07 62.07 0.91 1.64 5.00E−05 7.97E−04 BCL2-like 1
313 CASP4 chr11:104813593-104839325 16.46 30.81 0.90 1.52 5.00E−05 7.97E−04 caspase 4, apoptosis-related cysteine peptidase
314 MXRA8 chr1:1288068-1298921 13.33 24.92 0.90 1.52 5.00E−05 7.97E−04 matrix-remodelling associated 8
315 NFKBIZ chr3:101498028-101579869 16.54 30.88 0.90 1.45 5.00E−05 7.97E−04 nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, zeta
316 LPIN1 chr2:11817704-11967533 6.17 11.50 0.90 1.44 5.00E−05 7.97E−04 lipin 1
317 NBPF15 chr1:147574322-149109725 1.62 3.02 0.90 1.08 1.50E−04 2.18E−03 neuroblastoma breakpoint family, member 15
318 UACA chr15:70946892-71055850 6.58 12.26 0.90 1.71 5.00E−05 7.97E−04 uveal autoantigen with coiled-coil domains and ankyrin repeats
319 CPD chr17:28705941-28796675 12.89 24.02 0.90 1.67 5.00E−05 7.97E−04 carboxypeptidase D
320 S100A6 chr1:153507075-153508717 3434.60 6398.04 0.90 0.96 2.85E−03 2.59E−02 S100 calcium binding protein A6
321 TLR4 chr9:120466452-120479769 3.64 6.76 0.89 1.46 5.00E−05 7.97E−04 toll-like receptor 4
322 SYTL1 chr1:27668482-27680423 10.19 18.91 0.89 1.39 5.00E−05 7.97E−04 synaptotagmin-like 1
323 TIMP2 chr17:76849058-76921472 36.71 68.02 0.89 1.55 5.00E−05 7.97E−04 TIMP metallopeptidase inhibitor 2
324 RAB37 chr17:72667255-72743474 6.30 11.65 0.89 1.16 5.00E−05 7.97E−04 RAB37, member RAS oncogene family
325 SORD chr15:45315301-45367287 12.55 23.21 0.89 1.50 5.00E−05 7.97E−04 sorbitol dehydrogenase
326 ARL5B chr10:18948312-18966940 3.96 7.31 0.88 1.31 5.00E−05 7.97E−04 ADP-ribosylation factor-like 5B
327 MAP3K6 chr1:27681669-27693337 4.43 8.16 0.88 1.37 5.00E−05 7.97E−04 mitogen-activated protein kinase kinase kinase 6
328 FAM57A chr17:635846- 7.74 14.27 0.88 1.28 5.00E−05
7.97E−04 family with sequence similarity 646075 57, member A
329 PLA2G4C chr19:48551099-48614109 1.73 3.19 0.88 0.94 1.35E−03 1.41E−02 phospholipase A2, group IVC (cytosolic, calcium-independent)
330 TTC7A chr2:47129008-47303275 8.10 14.93 0.88 0.99 5.00E−04 6.13E−03 tetratricopeptide repeat domain 7A
331 SGMS1 chr10:520653444-52383737 3.77 6.95 0.88 1.32 5.00E−05 7.97E−04 sphingomyelin synthase 1
332 SETD7 chr4:140427191-140477577 5.58 10.24 0.88 1.53 5.00E−05 7.97E−04 SET domain containing (lysine methyltransferase) 7
333 RPL22L1 chr3:170582664-170588045 9.10 16.68 0.87 1.32 5.00E−05 7.97E−04 ribosomal protein L22-like 1
334 ANXA2 chr15:60639349-60690185 576.09 1055.30 0.87 1.11 2.50E−04 3.38E−03 annexin A2
335 IBTK chr6:82879955-82957448 14.47 26.46 0.87 1.61 5.00E−05 7.97E−04 inhibitor of Bruton agammaglobulinemia tyrosine kinase
336 COL9A2 chr1:40766162-40782939 3.72 6.78 0.87 1.21 1.50E−04 2.18E−03 collagen, type IX, alpha 2
337 TMEM165 chr4:56262079-56292342 34.83 63.43 0.86 1.56 5.00E−05 7.97E−04 transmembrane protein 165
338 KCNK6 chr19:38810483-38819649 14.23 25.91 0.86 1.47 5.00E−05 7.97E−04 potassium channel, two pore domain subfamily K, member 6
339 SP110 chr2:231033644-231090444 4.45 8.11 0.86 1.22 5.00E−05 7.97E−04 SP110 nuclear body protein
340 ORAI3 chr16:30960404-30966259 7.04 12.80 0.86 1.22 5.00E−05 7.97E−04 ORAI calcium release-activated calcium modulator 3
341 KCNQ4 chr1:41249683-41306124 1.04 1.89 0.86 0.92 1.50E−03 1.53E−02 potassium channel, voltage gated KQT-like subfamily Q, member 4
342 SNX9 chr6:158244202-158366109 29.39 53.33 0.86 1.54 5.00E−05 7.97E−04 sorting nexin 9
343 CABLES1 chr18:20714527-20840434 3.44 6.25 0.86 1.27 5.00E−05 7.97E−04 Cdk5 and Abl enzyme substrate 1
344 GGT1 chr22:24979717-25024972 8.14 14.77 0.86 1.04 9.00E−04 1.01E−02 gamma-glutamyltransferase 1
345 DDX60L chr4:169277885-169401665 7.16 12.99 0.86 1.58 5.00E−05 7.97E−04 DEAD (Asp-Glu-Ala-Asp) box polypeptide 60-like
346 SLC16A3 chr17:80186281- 57.19
103.38 0.85 1.49 5.00E−05 7.97E−04 solute carrier family 16 80197375 (monocarboxylate transporter), member 3
347 WWC2 chr4:184020462-184241929 0.61 1.10 0.85 0.97 7.00E−04 8.19E−03 WW and C2 domain containing 2
348 MAP2K3 chr17:21187967-21218551 30.45 54.88 0.85 1.50 5.00E−05 7.97E−04 mitogen-activated protein kinase kinase 3
349 IFIT2 chr10:91061705-91069033 2.39 4.30 0.85 1.18 5.00E−05 7.97E−04 interferon-induced protein with tetratricopeptide repeats 2
350 ERAP2 chr5:96211643-96255406 8.96 16.14 0.85 1.26 5.00E−05 7.97E−04 endoplasmic reticulum aminopeptidase 2
351 CALU chr7:128379345-128415844 12.79 23.01 0.85 1.56 5.00E−05 7.97E−04 calumenin
352 PFKFB3 chr10:6186842-6277507 5.30 9.53 0.85 1.33 5.00E−05 7.97E−04 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3
353 SNHG5 chr6:86386724-86388451 96.63 173.62 0.85 1.32 5.00E−05 7.97E−04 small nucleolar RNA host gene 5
354 GCNT3 chr15:59903981-59912210 204.61 367.58 0.85 1.22 5.00E−05 7.97E−04 glucosaminyl (N-acetyl) transferase 3, mucin type
355 NUP62CL chrX:106366656-106449670 1.59 2.85 0.84 0.83 6.30E−03 4.91E−02 nucleoporin 62 kDa C-terminal like
356 CD276 chr15:73976621-74006859 10.23 18.36 0.84 1.40 5.00E−05 7.97E−04 CD276 molecule
357 SLCO4A1 chr20:61273796-61303647 3.44 6.15 0.84 0.97 8.00E−04 9.10E−03 solute carrier organic anion transporter family, member 4A1
358 COL17A1 chr10:105791045-105845638 57.93 103.64 0.84 1.31 5.00E−05 7.97E−04 collagen, type XVII, alpha 1
359 ABTB1 chr3:127391780-127399769 12.64 22.53 0.83 1.33 5.00E−05 7.97E−04 ankyrin repeat and BTB (POZ) domain containing 1
360 TNFRSF1B chr1:12226999-12269277 11.28 20.07 0.83 1.43 5.00E−05 7.97E−04 tumor necrosis factor receptor superfamily, member 1B
361 GNPNAT1 chr14:53241910-53258386 18.14 32.26 0.83 1.57 5.00E−05 7.97E−04 glucosamine-phosphate N-acetyltransferase 1
362 ICA1 chr7:8152814-8302242 13.90 24.68 0.83 1.41 5.00E−05 7.97E−04 islet cell autoantigen 1, 69 kDa
363 RRAS chr19:50138551- 41.18 73.08 0.83 1.41 5.00E−05
7.97E−04 related RAS viral (r-ras) 50143400 oncogene homolog
364 TRANK1 chr3:36868307-36986548 13.30 23.57 0.83 1.43 5.00E−05 7.97E−04 tetratricopeptide repeat and ankyrin repeat containing 1
365 NFE2L3 chr7:26191846-26226756 6.13 10.86 0.82 1.35 5.00E−05 7.97E−04 nuclear factor, erythroid 2-like 3
366 STXBP6 chr14:25281305-25519095 6.42 11.35 0.82 1.14 1.00E−04 1.51E−03 syntaxin binding protein 6 (amisyn)
367 ACPP chr3:132036210-132087146 2.96 5.21 0.82 0.92 1.75E−03 1.74E−02 acid phosphatase, prostate
368 LOC100288778 chr12:87983-91263 3.19 5.63 0.82 0.82 1.65E−03 1.66E−02 WAS protein family homolog 1 pseudogene
369 SC5D chr11:121163387-121184119 6.78 11.94 0.82 1.43 5.00E−05 7.97E−04 sterol-C5-desaturase
370 CYP51A1 chr7:91741462-91764059 48.27 84.95 0.82 1.38 5.00E−05 7.97E−04 cytochrome P450, family 51, subfamily A, polypeptide 1
371 GLMP chr1:156262477-156265480 21.65 38.09 0.81 1.30 5.00E−05 7.97E−04 glycosylated lysosomal membrane protein
372 SOCS3 chr17:76352857-76356160 3.69 6.48 0.81 1.02 8.50E−04 9.59E−03 suppressor of cytokine signaling 3
373 EHD1 chr11:64620198-64647185 18.85 33.13 0.81 1.47 5.00E−05 7.97E−04 EH-domain containing 1
374 KLF2 chr19:16435650-16438339 9.99 17.55 0.81 1.14 3.50E−04 4.55E−03 Kruppel-like factor 2
375 LIF chr22:30636435-30642840 4.62 8.11 0.81 1.23 5.00E−05 7.97E−04 leukemia inhibitory factor
376 PLS3 chrX:114752496-114885179 9.47 16.60 0.81 1.45 5.00E−05 7.97E−04 plastin 3
377 HS3ST1 chr4:11399987-11430537 3.42 6.00 0.81 0.93 1.00E−03 1.10E−02 heparan sulfate (glucosamine) 3-O-sulfotransferase 1
378 SLC17A9 chr20:61583998-61599949 10.48 18.37 0.81 1.27 5.00E−05 7.97E−04 solute carrier family 17 (vesicular nucleotide transporter), member 9
379 SCG5 chr15:32933869-32989298 4.74 8.30 0.81 0.92 9.50E−04 1.05E−02 secretogranin V
380 C1GALT1 chr7:7222245-7288280 6.53 11.43 0.81 1.48 5.00E−05 7.97E−04 core 1 synthase, glycoprotein-N-acetylgalactosamine 3-beta-galactosyltransferase 1
381 BOK chr2:242483800- 5.68 9.94 0.81
1.18 1.50E−04 2.18E−03 BCL2-related ovarian killer 242513553
382 SHB chr9:37915894-38069210 3.58 6.27 0.81 1.27 5.00E−05 7.97E−04 Src homology 2 domain containing adaptor protein B
383 PDLIM7 chr5:176910394-176924606 11.66 20.39 0.81 1.12 2.00E−04 2.80E−03 PDZ and LIM domain 7 (enigma)
384 P4HB chr17:79801033-79818544 318.06 556.16 0.81 1.09 5.00E−05 7.97E−04 prolyl 4-hydroxylase, beta polypeptide
385 ANKRD22 chr10:90562486-90611732 5.76 10.06 0.81 1.41 5.00E−05 7.97E−04 ankyrin repeat domain 22
386 INSIG2 chr2:118846049-118867597 8.45 14.76 0.80 1.37 5.00E−05 7.97E−04 insulin induced gene 2
387 GABRE chrX:151121595-151143151 7.51 13.07 0.80 1.25 5.00E−05 7.97E−04 gamma-aminobutyric acid (GABA) A receptor, epsilon
388 TXNDC17 chr17:6481644-6554954 23.91 41.54 0.80 1.19 5.00E−05 7.97E−04 thioredoxin domain containing 17
389 GAN chr16:81348570-81413803 1.00 1.73 0.80 0.92 1.15E−03 1.24E−02 gigaxonin
390 MST1R chr3:49924435-49941306 31.40 54.48 0.79 1.41 5.00E−05 7.97E−04 macrophage stimulating 1 receptor
391 ITGB6 chr2:160956176-161056824 11.17 19.32 0.79 1.40 5.00E−05 7.97E−04 integrin, beta 6
392 MFSD2A chr1:40420783-40435640 10.01 17.30 0.79 1.17 5.00E−05 7.97E−04 major facilitator superfamily domain containing 2A
393 FUT8 chr14:65877309-66210839 6.81 11.75 0.79 1.34 5.00E−05 7.97E−04 fucosyltransferase 8 (alpha (1,6) fucosyltransferase)
394 RIMS3 chr1:41086351-41131324 3.48 6.01 0.79 1.27 5.00E−05 7.97E−04 regulating synaptic membrane exocytosis 3
395 SERPINA1 chr14:94843083-94857029 48.63 83.86 0.79 1.28 5.00E−05 7.97E−04 serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 1
396 SMURF2 chr17:62540734-62658386 4.42 7.63 0.79 1.22 5.00E−05 7.97E−04 SMAD specific E3 ubiquitin protein ligase 2
397 TMEM61 chr1:55446464-55457966 6.34 10.89 0.78 0.86 3.15E−03 2.79E−02 transmembrane protein 61
398 SH3D21 chr1:36771993-36786948 10.24 17.59 0.78 1.27 5.00E−05 7.97E−04 SH3 domain containing 21
399 OAS3 chr12:113376248- 8.44 14.50 0.78
1.28 5.00E−05 7.97E−04 2′-5′-oligoadenylate113411054 synthetase 3, 100 kDa 400 CBLB chr3: 105377108- 6.14 10.530.78 1.31 5.00E−05 7.97E−04 Cbl proto-oncogene B, E3 105587887 ubiquitinprotein ligase 401 LOC101927391 chr7: 7589734- 0.95 1.62 0.78 0.825.75E−03 4.55E−02 uncharacterized LOC101927391 7605696 402 PFKFB4 chr3:48555116- 3.33 5.70 0.78 1.07 2.50E−04 3.38E−036-phosphofructo-2-kinase/fructose- 48594227 2,6-biphosphatase 4 403KIAA1551 chr12: 32112352- 5.70 9.75 0.77 1.40 5.00E−05 7.97E−04 KIAA155132146043 404 STXBP1 chr9: 130374485- 2.15 3.68 0.77 1.00 5.00E−046.13E−03 syntaxin binding protein 1 130454995 405 DFNB31 chr9:117164359- 2.57 4.40 0.77 0.97 1.60E−03 1.62E−02 deafness, autosomalrecessive 31 117267736 406 VWA7 chr6_ssto_hap7: 2.21 3.78 0.77 0.921.45E−03 1.49E−02 von Willebrand factor A domain 3064183- containing 73075920 407 IL18 chr11: 112013973- 39.19 66.93 0.77 1.41 5.00E−057.97E−04 interleukin 18 112034840 408 RNF149 chr2: 101892062- 18.5831.64 0.77 1.42 5.00E−05 7.97E−04 ring finger protein 149 101925178 409LACTB2 chr8: 71520811- 13.48 22.91 0.77 1.27 5.00E−05 7.97E−04lactamase, beta 2 71581447 410 DPY19L1 chr7: 34968492- 7.14 12.13 0.761.35 5.00E−05 7.97E−04 dpy-19-like 1 (C. 
elegans) 35077653 411 PLIN3chr19: 4838345- 49.01 83.23 0.76 1.35 5.00E−05 7.97E−04 perilipin 34867780 412 STX18 chr4: 4387982- 14.44 24.52 0.76 1.25 5.00E−05 7.97E−04syntaxin 18 4543775 413 ALS2CL chr3: 46710484- 20.13 34.15 0.76 1.275.00E−05 7.97E−04 ALS2 C-terminal like 46735194 414 REEP6 chr19:1491164- 3.65 6.18 0.76 0.83 3.55E−03 3.07E−02 receptor accessoryprotein 6 1497924 415 SERPINB1 chr6: 2832565- 95.42 161.49 0.76 1.335.00E−05 7.97E−04 serpin peptidase inhibitor, clade B 2842283(ovalbumin), member 1 416 AFAP1L2 chr10: 116054582- 5.04 8.52 0.76 1.215.00E−05 7.97E−04 actin filament associated protein 1- 116164537 like 2417 BIK chr22: 43506753- 10.12 17.11 0.76 0.88 1.60E−03 1.62E−02BCL2-interacting killer (apoptosis- 43525718 inducing) 418 CCDC68 chr18:52568739- 17.33 29.29 0.76 1.44 5.00E−05 7.97E−04 coiled-coil domaincontaining 68 52626739 419 GEM chr8: 95261484- 2.04 3.45 0.75 0.823.30E−03 2.90E−02 GTP binding protein 95274547 overexpressed in skeletalmuscle 420 MCFD2 chr2: 47129008- 12.30 20.72 0.75 1.01 6.50E−04 7.69E−03multiple coagulation factor 47303275 deficiency 2 421 TCP11L2 chr12:106696568- 4.71 7.93 0.75 1.02 5.00E−05 7.97E−04 t-complex 11,testis-specific-like 2 106740792 422 NOSTRIN chr2: 169643048- 12.7121.39 0.75 1.33 1.00E−04 1.51E−03 nitric oxide synthase trafficking169721849 423 RARG chr12: 53604349- 4.98 8.36 0.75 1.05 1.00E−041.51E−03 retinoic acid receptor, gamma 53626040 424 ST6GALNAC4 chr9:130670164- 15.43 25.90 0.75 1.17 5.00E−05 7.97E−04 ST6(alpha-N-acetyl-neuraminyl- 130679305 2,3-beta-galactosyl-1,3)-N-acetylgalactosaminide alpha-2,6- sialyltransferase 4 425 HIST1H1C chr6:26055967- 67.62 113.43 0.75 1.24 5.00E−05 7.97E−04 histone cluster 1,H1c 26056699 426 FRY chr13: 32598195- 0.85 1.42 0.75 0.97 5.00E−046.13E−03 furry homolog (Drosophila) 5-32870776 427 FAM109B chr22:42470254- 7.37 12.35 0.75 1.11 5.00E−05 7.97E−04 family with sequencesimilarity 42475442 109, member B 428 TMEM140 chr7: 134832765- 9.0315.13 0.74 0.85 2.60E−03 
2.41E−02 transmembrane protein 140 134855578429 GK chrX: 30671475- 12.29 20.57 0.74 1.39 5.00E−05 7.97E−04 glycerolkinase 30749577 430 CREB3L2 chr7: 137559724- 12.24 20.48 0.74 1.215.00E−05 7.97E−04 cAMP responsive element binding 137686847 protein3-like 2 431 ACOT9 chrX: 23721776- 14.34 23.98 0.74 1.22 5.00E−057.97E−04 acyl-CoA thioesterase 9 23761407 432 RBPMS chr8: 30239634-11.58 19.36 0.74 0.99 8.00E−04 9.10E−03 RNA binding protein withmultiple 30429778 splicing 433 GDAP1 chr8: 75262617- 1.13 1.89 0.74 0.863.05E−03 2.72E−02 ganglioside induced differentiation 75279335associated protein 1 434 BCO1 chr16: 81272295- 1.66 2.78 0.74 0.783.00E−03 2.69E−02 beta-carotene oxygenase 1 81324747 435 GALK2 chr15:49447955- 7.84 13.10 0.74 1.18 5.00E−05 7.97E−04 galactokinase 249913118 436 GSTO2 chr10: 106028630- 10.13 16.90 0.74 0.92 1.05E−031.15E−02 glutathione S-transferase omega 2 106059176 437 CEP85 chr1:26560643- 5.56 9.27 0.74 1.01 7.50E−04 8.64E−03 centrosomal protein 85kDa 26605529 438 ETV5 chr3: 185764105- 1.92 3.19 0.74 0.92 1.05E−031.15E−02 ets variant 5 185826901 439 SLC45A4 chr8: 142217264- 6.10 10.170.74 1.15 5.00E−05 7.97E−04 solute carrier family 45, member 4 142264728440 FDPS chr1: 155278538- 104.25 173.68 0.74 1.00 7.50E−04 8.64E−03farnesyl diphosphate synthase 155300909 441 BACH1 chr21: 30671219- 6.6311.04 0.73 1.28 5.00E−05 7.97E−04 BTB and CNC homology 1, basic 30734217leucine zipper transcription factor 1 442 KIAA1217 chr10: 23983674-11.97 19.92 0.73 1.12 5.00E−05 7.97E−04 KIAA1217 24836777 443 MAOB chrX:43625856- 5.58 9.27 0.73 1.08 5.00E−05 7.97E−04 monoamine oxidase B43741721 444 SPRY4 chr5: 141689991- 5.11 8.47 0.73 1.15 1.00E−041.51E−03 sprouty RTK signaling antagonist 4 141704620 445 IL7R chr5:35856976- 5.04 8.34 0.73 1.15 1.00E−04 1.51E−03 interleukin 7 receptor35879705 446 YIPF5 chr5: 143537722- 16.40 27.15 0.73 1.39 5.00E−057.97E−04 Yip1 domain family, member 5 143550278 447 MMP14 chr14:23305741- 27.80 45.99 0.73 1.28 5.00E−05 7.97E−04 
matrixmetallopeptidase 14 23316808 (membrane-inserted) 448 RASEF chr9:85594499- 15.99 26.46 0.73 1.28 5.00E−05 7.97E−04 RAS and EF-hand domain85678043 containing 449 GLRX chr5: 95149552- 51.86 85.78 0.73 1.295.00E−05 7.97E−04 glutaredoxin (thioltransferase) 95158577 450 FAXDC2chr5: 154198051- 5.97 9.86 0.73 1.11 1.50E−04 2.18E−03 fatty acidhydroxylase domain 154230213 containing 2 451 SMIM3 chr5: 150157507-7.36 12.17 0.73 1.06 1.00E−04 1.51E−03 small integral membrane protein 3150176298 452 YPEL3 chr16: 30103634- 35.25 58.05 0.72 1.15 5.00E−057.97E−04 yippee-like 3 30107537 453 TINAGL1 chr1: 32042085- 59.52 98.000.72 1.28 5.00E−05 7.97E−04 tubulointerstitial nephritis antigen-32053287 like 1 454 CRYZ chr1: 75171171- 8.39 13.77 0.72 0.83 4.60E−033.78E−02 crystallin, zeta (quinone reductase) 75232360 455 SRXN1 chr20:627267- 11.46 18.81 0.71 1.18 5.00E−05 7.97E−04 sulfiredoxin 1 634014456 RASA4 chr7: 102220092- 4.62 7.58 0.71 1.17 5.00E−05 7.97E−04 RAS p21protein activator 4 102257205 457 RSPH1 chr21: 43892596- 3.87 6.33 0.710.80 3.50E−03 3.04E−02 radial spoke head 1 homolog 43916464(Chlamydomonas) 458 ZNF292 chr6: 87865268- 3.26 5.33 0.71 1.26 5.00E−057.97E−04 zinc finger protein 292 87973406 459 LRP10 chr14: 23340959-124.41 202.86 0.71 1.14 1.00E−04 1.51E−03 low density lipoproteinreceptor- 23347291 related protein 10 460 CAPN8 chr1: 223714971- 53.8287.75 0.71 1.22 5.00E−05 7.97E−04 calpain 8 223853436 461 LOC146880chr17: 62745779- 21.87 35.64 0.70 1.25 5.00E−05 7.97E−04 Rho GTPaseactivating protein 27 62778117 pseudogene 462 TMEM263 chr12: 107349543-10.60 17.27 0.70 1.28 5.00E−05 7.97E−04 transmembrane protein 263107367813 463 BBS12 chr4: 123653856- 1.70 2.77 0.70 0.81 4.25E−033.56E−02 Bardet-Biedl syndrome 12 123666098 464 RAPH1 chr2: 204298404-5.04 8.20 0.70 1.16 5.00E−05 7.97E−04 Ras association (RalGDS/AF-6)204400058 and pleckstrin homology domains 1 465 TANK chr2: 161993465-19.25 31.30 0.70 1.15 1.00E−04 1.51E−03 TRAF family member-associated162111154 NFKB 
activator 466 SLC30A7 chr1: 101361631- 5.81 9.44 0.701.29 5.00E−05 7.97E−04 solute carrier family 30 (zinc 101447311transporter), member 7 467 NID2 chr14: 52471519- 2.46 4.00 0.70 0.963.50E−04 4.55E−03 nidogen 2 (osteonidogen) 52535946 468 PTPN12 chr7:77166772- 12.85 20.86 0.70 1.29 5.00E−05 7.97E−04 protein tyrosinephosphatase, non- 77269388 receptor type 12 469 ABHD11-AS1 chr7:73149398- 32.52 52.76 0.70 0.91 1.25E−03 1.33E−02 ABHD11 antisense RNA 1(tail to 73150330 tail) 470 SEMA4B chr15: 90728151- 36.01 58.43 0.701.28 5.00E−05 7.97E−04 sema domain, immunoglobulin 90772892 domain (Ig),transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 4B471 GTF2E2 chr8: 30436030- 14.84 24.09 0.70 1.17 5.00E−05 7.97E−04general transcription factor IIE, 30515738 polypeptide 2, beta 34 kDa472 CYTH2 chr19: 48972464- 17.54 28.41 0.70 1.27 5.00E−05 7.97E−04cytohesin 2 48985571 473 TSPAN1 chr1: 46640748- 1044.92 1691.96 0.700.77 4.55E−03 3.75E−02 tetraspanin 1 46651634 474 SEC13 chr3: 10342612-68.75 111.28 0.69 1.28 5.00E−05 7.97E−04 SEC13 homolog, nuclear pore and10362872 COPII coat complex component 475 DYNLT3 chrX: 37698088- 16.5926.79 0.69 1.30 5.00E−05 7.97E−04 dynein, light chain, Tctex-type 337706889 476 INSIG1 chr7: 155089485- 26.44 42.70 0.69 1.11 5.00E−057.97E−04 insulin induced gene 1 155101945 477 DVL1 chr1: 1270657- 21.3134.39 0.69 1.29 5.00E−05 7.97E−04 dishevelled segment polarity 1284492protein 1 478 CNKSR3 chr6: 154726432- 8.64 13.93 0.69 1.15 5.00E−057.97E−04 CNKSR family member 3 154831753 479 DCUN1D5 chr11: 102921412-0.92 1.49 0.69 1.08 4.00E−04 5.06E−03 DCN1, defective in cullin102962944 neddylation 1, domain containing 5 480 PRKY chrY: 7142012-1.33 2.15 0.69 0.78 6.20E−03 4.85E−02 protein kinase, Y-linked, 7249588pseudogene 481 OSBPL8 chr12: 76745577- 4.59 7.38 0.68 1.18 5.00E−057.97E−04 oxysterol binding protein-like 8 76953589 482 CAPG chr2:85621870- 77.07 123.87 0.68 1.25 5.00E−05 7.97E−04 capping protein(actin filament), 85641197 gelsolin-like 
483 ETV3 chr1: 157094458- 4.176.69 0.68 0.86 2.40E−03 2.25E−02 ets variant 3 157108383 484 FBXO6 chr1:11724149- 5.04 8.09 0.68 0.80 4.85E−03 3.96E−02 F-box protein 6 11734409485 TNFRSF10A chr8: 23048969- 4.04 6.47 0.68 0.84 9.50E−04 1.05E−02tumor necrosis factor receptor 23082680 superfamily, member 10a 486MYDGF chr19: 4657556- 90.58 144.76 0.68 1.22 5.00E−05 7.97E−04myeloid-derived growth factor 4670415 487 TGM2 chr20: 36756863- 10.3416.51 0.68 1.08 5.00E−05 7.97E−04 transglutaminase 2 36793700 488 ZFYVE1chr14: 73436152- 5.78 9.24 0.68 0.96 7.50E−04 8.64E−03 zinc finger, FYVEdomain 73493920 containing 1 489 HERC4 chr10: 69681655- 15.20 24.25 0.671.25 5.00E−05 7.97E−04 HECT and RLD domain containing 69835103 E3ubiquitin protein ligase 4 490 IL15RA chr10: 5994333- 6.27 10.00 0.670.88 2.65E−03 2.44E−02 interleukin 15 receptor, alpha 6020150 491 CKLFchr16: 66586465- 27.10 43.23 0.67 0.96 8.50E−04 9.59E−03 chemokine-likefactor 66613038 492 PCBP4 chr3: 51989329- 8.08 12.89 0.67 0.97 9.00E−041.01E−02 poly(rC) binding protein 4 52001482 493 ARHGEF28 chr5:72921982- 2.05 3.26 0.67 0.96 5.00E−04 6.13E−03 Rho guanine nucleotideexchange 73237818 factor (GEF) 28 494 CD2 chr1: 117297085- 6.39 10.180.67 0.92 1.25E−03 1.33E−02 CD2 molecule 117311851 495 STX19 chr3:93698982- 11.60 18.49 0.67 1.01 9.50E−04 1.05E−02 syntaxin 19 93774522496 LYST chr1: 235824330- 2.89 4.60 0.67 0.93 2.40E−03 2.25E−02lysosomal trafficking regulator 236047008 497 RPL37 chr5: 40831429-230.43 366.73 0.67 1.13 3.00E−04 3.97E−03 ribosomal protein L37 40835387498 BZW1 chr2: 201560445- 99.09 157.57 0.67 1.15 5.00E−05 7.97E−04 basicleucine zipper and W2 201688569 domains 1 499 EEPD1 chr7: 36192835- 2.964.71 0.67 0.93 6.50E−04 7.69E−03 endonuclease/exonuclease/ 36341152phosphatase family domain containing 1 500 PPA1 chr10: 71962585- 104.60166.05 0.67 1.26 5.00E−05 7.97E−04 pyrophosphatase (inorganic) 171993190 501 S100A16 chr1: 153579366- 274.09 434.63 0.67 1.14 5.00E−057.97E−04 S100 calcium binding protein A16 
153585514 502 RIN1 chr11:66099541- 4.90 7.76 0.66 0.90 2.15E−03 2.06E−02 Ras and Rab interactor 166104000 503 IFI44 chr1: 79115476- 3.07 4.85 0.66 0.77 6.40E−03 4.97E−02interferon-induced protein 44 79129763 504 ECT2 chr3: 172468474- 8.0512.74 0.66 1.18 5.00E−05 7.97E−04 epithelial cell transforming 2172539264 505 RNASE4 chr14: 21152335- 59.94 94.78 0.66 1.04 5.00E−057.97E−04 ribonuclease, RNase A family, 4 21168761 506 CD247 chr1:167399876- 3.79 5.99 0.66 0.75 5.20E−03 4.20E−02 CD247 molecule167487847 507 AGR3 chr7: 16899029- 132.11 208.74 0.66 1.24 5.00E−057.97E−04 anterior gradient 3, protein 16921613 disulphide isomerasefamily member 508 KDSR chr18: 60994970- 6.40 10.11 0.66 1.19 5.00E−057.97E−04 3-ketodihydrosphingosine 61034506 reductase 509 AP3S1 chr5:115177618- 49.87 78.68 0.66 1.25 5.00E−05 7.97E−04 adaptor-relatedprotein complex 3, 115249778 sigma 1 subunit 510 TMED9 chr5: 177019212-115.33 181.81 0.66 1.18 5.00E−05 7.97E−04 transmembrane p24 trafficking177023099 protein 9 511 SLC37A1 chr21: 43919741- 27.45 43.28 0.66 1.175.00E−05 7.97E−04 solute carrier family 37 (glucose-6- 44001550phosphate transporter), member 1 512 NTHL1 chr16: 2089815- 11.06 17.410.66 0.84 2.90E−03 2.62E−02 nth-like DNA glycosylase 1 2097867 513IFNLR1 chr1: 24480646- 4.41 6.94 0.65 1.02 2.50E−04 3.38E−03 interferon,lambda receptor 1 24513765 514 B4GALT6 chr18: 29202208- 1.46 2.30 0.650.85 3.85E−03 3.28E−02 UDP-Gal: betaGlcNAc beta 1,4- 29264686galactosyltransferase, polypeptide 6 515 TFF3 chr21: 43731776- 685.721079.30 0.65 0.81 3.45E−03 3.01E−02 trefoil factor 3 (intestinal)43735706 516 TC2N chr14: 92246095- 17.32 27.25 0.65 1.27 5.00E−057.97E−04 tandem C2 domains, nuclear 92333880 517 CTNNAL1 chr9:111704848- 5.08 7.99 0.65 1.01 3.50E−04 4.55E−03 catenin(cadherin-associated 111775874 protein), alpha-like 1 518 DNAJC10 chr2:183580767- 17.97 28.23 0.65 1.19 5.00E−05 7.97E−04 DnaJ (Hsp40) homolog,subfamily 183644750 C, member 10 519 DEDD2 chr19: 42702744- 24.45 38.370.65 1.12 3.50E−04 
4.55E−03 death effector domain containing 2 42724304520 SH2D3A chr19: 6752172- 18.47 28.99 0.65 1.16 1.00E−04 1.51E−03 SH2domain containing 3A 6767523 521 SERAC1 chr6: 158530535- 2.77 4.34 0.650.87 1.40E−03 1.45E−02 serine active site containing 1 158589312 522TMED3 chr15: 79603490- 51.61 80.91 0.65 1.16 5.00E−05 7.97E−04transmembrane p24 trafficking 79615189 protein 3 523 THY1 chr11:119252487- 6.57 10.30 0.65 0.90 2.30E−03 2.18E−02 Thy-1 cell surfaceantigen 119369944 524 TOR4A chr9: 140172279- 22.28 34.93 0.65 1.221.00E−04 1.51E−03 torsin family 4, member A 140177093 525 GDA chr9:74729510- 18.87 29.56 0.65 1.19 5.00E−05 7.97E−04 guanine deaminase74867140 526 HELZ2 chr20: 62189438- 4.77 7.47 0.65 1.09 2.50E−043.38E−03 helicase with zinc finger 2, 62205592 transcriptionalcoactivator 527 AHR chr7: 17338275- 9.54 14.95 0.65 1.14 5.00E−057.97E−04 aryl hydrocarbon receptor 17385775 528 OCIAD2 chr4: 48887396-131.44 205.76 0.65 1.17 5.00E−05 7.97E−04 OCIA domain containing 248908845 529 CCL5 chr17: 34198495- 15.82 24.76 0.65 0.98 1.50E−042.18E−03 chemokine (C-C motif) ligand 5 34207377 530 SEC14L1 chr17:7508472 12.89 20.18 0.65 1.03 3.00E−04 3.97E−03 SEC14-like lipid binding1 75213181 531 S100A14 chr1: 153586731- 370.73 579.89 0.65 1.02 1.00E−041.51E−03 S100 calcium binding protein A14 153588808 532 EMP1 chr12:13349601- 171.36 267.93 0.64 0.94 7.50E−04 8.64E−03 epithelial membraneprotein 1 13369708 533 SQLE chr8: 126010719- 24.28 37.93 0.64 1.012.50E−04 3.38E−03 squalene epoxidase 126034525 534 ATG4A chrX:107334898- 12.72 19.85 0.64 1.14 1.00E−04 1.51E−03 autophagy related 4A,cysteine 107397901 peptidase 535 TNK2 chr3: 195590235- 10.38 16.18 0.641.16 5.00E−05 7.97E−04 tyrosine kinase, non-receptor, 2 195635880 536KIAA0040 chr1: 175126122- 6.97 10.86 0.64 1.09 2.00E−04 2.80E−03KIAA0040 175162229 537 FOXP4 chr6: 41514163- 8.28 12.91 0.64 1.132.00E−04 2.80E−03 forkhead box P4 41570122 538 HM13 chr20: 30102212-69.16 107.78 0.64 1.15 1.00E−04 1.51E−03 histocompatibility 
(minor) 1330161066 539 PMP22 chr17: 15133093- 27.45 42.76 0.64 1.11 1.00E−041.51E−03 peripheral myelin protein 22 15168674 540 KCNE3 chr11:74165885- 9.33 14.53 0.64 1.09 5.00E−05 7.97E−04 potassium channel,voltage gated 74178600 subfamily E regulatory beta subunit 3 541 TMEM2chr9: 74298281- 19.87 30.91 0.64 1.14 5.00E−05 7.97E−04 transmembraneprotein 2 74383800 542 SUCO chr1: 172501488- 6.49 10.09 0.64 1.082.00E−04 2.80E−03 SUN domain containing 172580975 ossification factor543 GPCPD1 chr20: 5525079- 9.60 14.93 0.64 1.20 5.00E−05 7.97E−04glycerophosphocholine 5591672 phosphodiesterase 1 544 CDHR2 chr5:175969511- 61.21 95.10 0.64 1.06 3.50E−04 4.55E−03 cadherin-relatedfamily member 2 176022769 545 LRCH1 chr13: 47127295- 3.75 5.82 0.63 0.994.00E−04 5.06E−03 leucine-rich repeats and calponin 47327175 homology(CH) domain containing 1 546 IFNGR1 chr6: 137518620- 59.23 91.94 0.631.20 5.00E−05 7.97E−04 interferon gamma receptor 1 137540567 547 ARRDC2chr19: 18111940- 15.23 23.65 0.63 1.09 1.00E−04 1.51E−03 arrestin domaincontaining 2 18124911 548 MID1IP1 chrX: 38660500- 18.29 28.38 0.63 1.004.00E−04 5.06E−03 MIDI interacting protein 1 38665783 549 FA2H chr16:74746855- 28.66 44.45 0.63 1.14 5.00E−05 7.97E−04 fatty acid2-hydroxylase 74808729 550 RIPK4 chr21: 43159528- 5.38 8.34 0.63 0.974.00E−04 5.06E−03 receptor-interacting serine- 43187249 threonine kinase4 551 RHPN1 chr8: 144451024- 2.06 3.19 0.63 0.78 5.60E−03 4.46E−02rhophilin, Rho GTPase binding 144466390 protein 1 552 RAB3D chr19:11406814- 5.49 8.50 0.63 1.00 4.00E−04 5.06E−03 RAB3D, member RASoncogene 11450344 family 553 ALDH1A3 chr15: 101420008- 2.98 4.62 0.630.85 2.05E−03 1.98E−02 aldehyde dehydrogenase 1 family, 101456830 memberA3 554 HLA-F chr6_ssto_hap7: 64.83 100.33 0.63 1.10 5.00E−05 7.97E−04major histocompatibility complex, 1028896- class I, F 1054576 555 TCF12chr15: 57210832- 12.44 19.24 0.63 1.10 1.50E−04 2.18E−03 transcriptionfactor 12 57580714 556 B4GALT2 chr1: 44444873- 14.91 23.06 0.63 1.002.50E−04 
3.38E−03 UDP-Gal: betaGlcNAc beta 1,4- 44456843galactosyltransferase, polypeptide 2 557 STIM2 chr4: 26862312- 5.64 8.720.63 1.08 1.50E−04 2.18E−03 stromal interaction molecule 2 27027003 558ALCAM chr3: 105085556- 5.05 7.80 0.63 1.06 5.00E−05 7.97E−04 activatedleukocyte cell adhesion 105295757 molecule 559 C6orf132 chr6: 42068856-5.22 8.04 0.62 1.07 5.00E−05 7.97E−04 chromosome 6 open reading frame42110715 132 560 BTNL3 chr5: 180415844- 45.84 70.52 0.62 1.06 1.50E−042.18E−03 butyrophilin-like 3 180433727 561 ANXA5 chr4: 122589151- 78.32120.46 0.62 1.13 5.00E−05 7.97E−04 annexin A5 122618147 562 HYAL2 chr3:50355220- 12.04 18.51 0.62 0.99 6.50E−04 7.69E−03hyaluronoglucosaminidase 2 50360281 563 TRERF1 chr6: 42192668- 2.06 3.170.62 0.91 5.00E−04 6.13E−03 transcriptional regulating factor 1 42419783564 P4HA1 chr10: 74766979- 12.34 18.96 0.62 1.09 1.00E−04 1.51E−03prolyl 4-hydroxylase, alpha 74856732 polypeptide I 565 LATS2 chr13:21547175- 2.10 3.22 0.62 0.84 2.05E−03 1.98E−02 large tumor suppressorkinase 2 21635722 566 UBE2L6 chr11: 57319127- 23.06 35.43 0.62 0.986.00E−04 7.22E−03 ubiquitin-conjugating enzyme E2L 57335803 6 567 SOWAHCchr2: 110371910- 12.09 18.58 0.62 1.12 5.00E−05 7.97E−04 sosondowahankyrin repeat 110376564 domain family member C 568 PALD1 chr10:72238563- 1.98 3.03 0.62 0.82 3.80E−03 3.25E−02 phosphatase domaincontaining, 72328206 paladin 1 569 C4orf32 chr4: 113066552- 3.49 5.350.62 0.82 5.40E−03 4.34E−02 chromosome 4 open reading frame 113110237 32570 UBA6 chr4: 68481478- 6.70 10.29 0.62 1.17 5.00E−05 7.97E−04ubiquitin-like modifier activating 68566889 enzyme 6 571 PFKP chr10:3109711- 35.94 55.13 0.62 1.12 5.00E−05 7.97E−04 phosphofructokinase,platelet 3178997 572 OSGIN1 chr16: 83986826- 6.30 9.66 0.62 0.823.95E−03 3.35E−02 oxidative stress induced growth 83999937 inhibitor 1573 TJP1 chr15: 29992356- 11.11 17.04 0.62 1.11 1.00E−04 1.51E−03 tightjunction protein 1 30114706 574 TUBA4A chr2: 220110191- 32.10 49.18 0.620.93 7.50E−04 8.64E−03 tubulin, alpha 
4a 220136910 575 ZNFX1 chr20:47862438- 9.19 14.08 0.61 0.88 1.60E−03 1.62E−02 zinc finger, NFX1-type47905795 containing 1 576 ITGA1 chr5: 52083773- 5.29 8.09 0.61 0.882.65E−03 2.44E−02 integrin, alpha 1 52249485 577 IL20RA chr6: 137321107-5.72 8.75 0.61 0.98 9.00E−04 1.01E−02 interleukin 20 receptor, alpha137366317 578 MARCKSL1 chr1: 32799429- 122.79 187.82 0.61 1.12 5.00E−057.97E−04 MARCKS-like 1 32801840 579 GALNT3 chr2: 166604312- 39.67 60.670.61 1.13 1.00E−04 1.51E−03 polypeptide N- 166650803acetylgalactosaminyltransferase 3 580 F3 chr1: 94994731- 28.22 43.160.61 1.09 1.50E−04 2.18E−03 coagulation factor III 95007413(thromboplastin, tissue factor) 581 PARD6B chr20: 49348080- 3.14 4.800.61 0.90 1.55E−03 1.57E−02 par-6 family cell polarity regulator49370278 beta 582 PPP2R2A chr8: 26149006- 13.21 20.17 0.61 1.19 5.00E−057.97E−04 protein phosphatase 2, regulatory 26230195 subunit B, alpha 583CASP10 chr2: 202047620- 10.18 15.54 0.61 1.08 1.50E−04 2.18E−03 caspase10, apoptosis-related 202094129 cysteine peptidase 584 PIM3 chr22:50354142- 26.30 40.16 0.61 1.10 5.00E−05 7.97E−04 Pim-3 proto-oncogene,50357720 serine/threonine kinase 585 LAMB3 chr1: 209788217- 41.79 63.790.61 1.09 5.00E−05 7.97E−04 laminin, beta 3 209825820 586 SLC35C1 chr11:45825622- 32.10 48.98 0.61 1.12 5.00E−05 7.97E−04 solute carrier family35 (GDP- 45834567 fucose transporter), member C1 587 CCDC120 chrX:48910960- 5.26 8.03 0.61 0.92 2.15E−03 2.06E−02 coiled-coil domaincontaining 120 48927510 588 DHRS7 chr14: 60611499- 59.83 91.23 0.61 1.102.00E−04 2.80E−03 dehydrogenase/reductase (SDR 60632211 family) member 7589 HPSE chr4: 84213613- 4.04 6.16 0.61 0.94 9.50E−04 1.05E−02heparanase 84256306 590 ENPP4 chr6: 46097700- 9.04 13.78 0.61 1.155.00E−05 7.97E−04 ectonucleotide 46114436pyrophosphatase/phosphodiesterase 4 (putative) 591 MAL2 chr8: 120220609-153.69 234.21 0.61 1.01 3.50E−04 4.55E−03 mal, T-cell differentiationprotein 2 120257914 (gene/pseudogene) 592 MAP4K3 chr2: 39476406- 7.3411.19 0.61 1.06 
3.50E−04 4.55E−03 mitogen-activated protein kinase39664453 kinase kinase kinase 3 593 IFI27L2 chr14: 94594117- 46.92 71.470.61 0.81 2.45E−03 2.29E−02 interferon, alpha-inducible protein 9459595727-like 2 594 MPG chr16: 127017- 28.23 42.96 0.61 0.83 4.35E−03 3.63E−02N-methylpurine DNA glycosylase 188697 595 SMAD3 chr15: 67358194- 14.1521.52 0.61 1.07 5.00E−05 7.97E−04 SMAD family member 3 67487533 596CCDC127 chr5: 204874- 6.07 9.22 0.60 0.75 6.15E−03 4.82E−02 coiled-coildomain containing 127 218297 597 ZNF37BP chr10: 43008960- 1.52 2.30 0.600.93 4.00E−04 5.06E−03 zinc finger protein 37B, 43048318 pseudogene 598THEM4 chr1: 151843342- 2.01 3.05 0.60 0.82 2.90E−03 2.62E−02thioesterase superfamily member 4 151882361 599 TM9SF3 chr10: 98277866-83.43 126.41 0.60 1.02 6.00E−04 7.22E−03 transmembrane 9 superfamily98346809 member 3 600 ARHGAP12 chr10: 32094325- 12.11 18.34 0.60 1.125.00E−05 7.97E−04 Rho GTPase activating protein 12 32217804 601 ZNF706chr8: 102209265- 29.25 44.30 0.60 1.21 1.50E−04 2.18E−03 zinc fingerprotein 706 102218292 602 PVRL2 chr19: 45349392- 35.79 54.20 0.60 1.085.00E−05 7.97E−04 poliovirus receptor-related 2 45392485 (herpesvirusentry mediator B) 603 RASSF6 chr4: 74437266- 10.28 15.55 0.60 1.102.00E−04 2.80E−03 Ras association (RalGDS/AF-6) 74486348 domain familymember 6 604 YAE1D1 chr7: 39605974- 14.13 21.37 0.60 0.88 2.05E−031.98E−02 Yae1 domain containing 1 39651688 605 ARL14 chr3: 160394947-37.06 56.02 0.60 1.08 2.00E−04 2.80E−03 ADP-ribosylation factor-like 14160396235 606 JMJD1C chr10: 64926980- 5.09 7.70 0.60 1.09 2.50E−043.38E−03 jumonji domain containing 1C 65226322 607 ZNF841 chr19:52567718- 4.11 6.21 0.59 0.91 9.00E−04 1.01E−02 zinc finger protein 84152599018 608 CAPRIN2 chr12: 30862485- 3.07 4.63 0.59 0.89 2.10E−032.02E−02 caprin family member 2 30907448 609 NMRK1 chr9: 77676115- 17.5926.54 0.59 0.95 5.00E−04 6.13E−03 nicotinamide riboside kinase 177703133 610 TMC6 chr17: 76108998- 23.85 35.96 0.59 0.88 3.25E−032.86E−02 transmembrane 
channel-like 6 76139049 611 DNAJC3 chr13:96329392- 20.29 30.58 0.59 1.08 1.50E−04 2.18E−03 DnaJ (Hsp40) homolog,subfamily 96447243 C, member 3 612 PRRG1 chrX: 37208527- 3.11 4.69 0.590.85 2.25E−03 2.14E−02 proline rich Gla 37316548 (G-carboxyglutamicacid) 1 613 SERPINB6 chr6: 2948392- 147.22 221.74 0.59 1.02 2.50E−043.38E−03 serpin peptidase inhibitor, 2972399 clade B (ovalbumin), member6 614 DST chr6: 56137688- 9.54 14.37 0.59 1.03 5.00E−04 6.13E−03dystonin 56819426 615 PYCR1 chr17: 79890261- 32.64 49.11 0.59 1.062.50E−04 3.38E−03 pyrroline-5-carboxylate 79895204 reductase 1 616OSBPL3 chr7: 24836155- 4.32 6.49 0.59 0.99 9.50E−04 1.05E−02 oxysterolbinding protein-like 3 25019831 617 CHMP4C chr8: 82644687- 14.92 22.420.59 1.00 6.00E−04 7.22E−03 charged multivesicular body 82671748 protein4C 618 CERS2 chr1: 150937648- 53.63 80.50 0.59 1.06 1.00E−04 1.51E−03ceramide synthase 2 150947479 619 TSPAN3 chr15: 77336359- 164.57 246.930.59 0.87 9.50E−04 1.05E−02 tetraspanin 3 77363570 620 KCTD21 chr11:77850838- 3.64 5.46 0.58 0.82 3.20E−03 2.82E−02 potassium channeltetramerization 77899664 domain containing 21 621 GUK1 chr1: 228327784-130.54 195.74 0.58 1.06 1.50E−04 2.18E−03 guanylate kinase 1 228336655622 MRPL17 chr11: 6701615- 9.26 13.88 0.58 0.95 3.00E−04 3.97E−03mitochondrial ribosomal protein 6704632 L17 623 PLEKHM1 chr17_ctg5_hap1:8.99 13.47 0.58 1.06 2.50E−04 3.38E−03 pleckstrin homology domain128327- containing, family M (with RUN 183214 domain) member 1 624 CD58chr1: 117057155- 20.54 30.79 0.58 0.97 7.50E−04 8.64E−03 CD58 molecule117113715 625 NDEL1 chr17: 8339169- 15.87 23.79 0.58 1.01 3.00E−043.97E−03 nudE neurodevelopment protein 1- 8371495 like 1 626 GSTP1chr11: 67351065- 304.06 455.64 0.58 1.03 3.00E−04 3.97E−03 glutathioneS-transferase pi 1 67354124 627 FEM1C chr5: 114856607- 11.81 17.70 0.581.10 5.00E−05 7.97E−04 fem-1 homolog c (C. 
elegans) 114880591 628MAP1LC3B chr16: 87425800- 29.29 43.86 0.58 1.08 1.00E−04 1.51E−03microtubule-associated protein 1 87438380 light chain 3 beta 629 PSENENchr19: 36236477- 50.58 75.70 0.58 0.95 6.50E−04 7.69E−03 presenilinenhancer gamma 36238056 secretase subunit 630 PTPRR chr12: 71031852-16.69 24.96 0.58 1.05 3.00E−04 3.97E−03 protein tyrosine phosphatase,71314584 receptor type, R 631 BFAR chr16: 14726667- 16.41 24.54 0.581.04 2.50E−04 3.38E−03 bifunctional apoptosis regulator 14763093 632GRB7 chr17: 37894161- 13.24 19.77 0.58 0.91 1.35E−03 1.41E−02 growthfactor receptor-bound 37903538 protein 7 633 BTNL8 chr5: 180326076-34.27 51.17 0.58 0.99 1.50E−04 2.18E−03 butyrophilin-like 8 180377906634 TWF2 chr3: 52262625- 32.82 48.99 0.58 1.03 2.00E−04 2.80E−03twinfilin actin binding protein 2 52273183 635 CD63 chr12: 56119226-804.12 1200.00 0.58 0.85 1.85E−03 1.82E−02 CD63 molecule 56123457 636RAP1B chr12: 69004618- 95.48 142.33 0.58 1.06 5.00E−05 7.97E−04 RAP1B,member of RAS oncogene 69054385 family 637 TMEM62 chr15: 43425721- 12.8419.14 0.58 1.00 5.00E−05 7.97E−04 transmembrane protein 62 43477341 638FURIN chr15: 91411821- 26.45 39.41 0.58 1.06 4.00E−04 5.06E−03 furin(paired basic amino acid 91426688 cleaving enzyme) 639 DIAPH1 chr5:140894587- 22.87 34.07 0.58 1.02 3.00E−04 3.97E−03 diaphanous-relatedformin 1 140998622 640 HID1 chr17: 72946838- 19.28 28.71 0.57 1.022.50E−04 3.38E−03 HID1 domain containing 72971823 641 ZDHHC9 chrX:128937263- 11.29 16.81 0.57 1.03 2.50E−04 3.38E−03 zinc finger,DHHC-type containing 128977910 9 642 PRKAG2 chr7: 151253200- 14.00 20.820.57 0.98 9.50E−04 1.05E−02 protein kinase, AMP-activated, 151576308gamma 2 non-catalytic subunit 643 PLD2 chr17: 4710395- 7.56 11.24 0.570.94 5.50E−04 6.69E−03 phospholipase D2 4726727 644 NOP10 chr15:34633916- 133.64 198.50 0.57 1.00 2.00E−04 2.80E−03 NOP 10ribonucleoprotein 34635362 645 LCOR chr10: 98592016- 8.56 12.72 0.571.04 4.00E−04 5.06E−03 ligand dependent nuclear receptor 98724198corepressor 646 
JAK2 chr9: 4985244- 3.57 5.30 0.57 0.92 8.00E−049.10E−03 Janus kinase 2 5128183 647 ZNF468 chr19: 53341784- 4.75 7.040.57 0.91 1.75E−03 1.74E−02 zinc finger protein 468 53360902 648 DLL4chr15: 41221530- 4.73 7.02 0.57 0.82 2.40E−03 2.25E−02 delta-like 4(Drosophila) 41231258 649 TTC39A chr1: 51752929- 24.82 36.76 0.57 1.023.00E−04 3.97E−03 tetratricopeptide repeat domain 51810785 39A 650 MIA3chr1: 222791443- 9.37 13.86 0.56 1.07 2.00E−04 2.80E−03 melanomainhibitory activity 222841351 family, member 3 651 PSMA5 chr1:109941652- 25.78 38.12 0.56 1.15 5.00E−05 7.97E−04 proteasome subunitalpha 5 109969108 652 GRINA chr8: 145064225- 35.27 52.15 0.56 1.003.00E−04 3.97E−03 glutamate receptor, ionotropic, N- 145067583 methylD-aspartate-associated protein 1 (glutamate binding) 653 MVD chr16:88718347- 39.02 57.69 0.56 1.00 5.00E−04 6.13E−03 mevalonate (diphospho)88729495 decarboxylase 654 PIK3IP1 chr22: 31677578- 13.51 19.96 0.560.95 9.50E−04 1.05E−02 phosphoinositide-3-kinase 31688520 interactingprotein 1 655 MTHFD2 chr2: 74425689- 15.35 22.68 0.56 0.98 6.50E−047.69E−03 methylenetetrahydrofolate 74442424 dehydrogenase (NADP +dependent) 2, methenyltetrahydrofolate cyclohydrolase 656 RAP2B chr3:152880000- 3.97 5.87 0.56 1.01 2.50E−04 3.38E−03 RAP2B, member of RASoncogene 152888413 family 657 MORF4L2 chrX: 102930425- 68.30 100.82 0.561.03 8.50E−04 9.59E−03 mortality factor 4 like 2 102947484 658 CCDC64chr12: 120427647- 5.33 7.86 0.56 0.85 2.35E−03 2.22E−02 coiled-coildomain containing 64 120532299 659 HDAC9 chr7: 18126571- 2.54 3.74 0.560.77 6.45E−03 5.00E−02 histone deacetylase 9 19036992 660 HERC6 chr4:89299890- 3.60 5.30 0.56 0.85 2.50E−03 2.33E−02 HECT and RLD domaincontaining 89364249 E3 ubiquitin protein ligase family member 6 661FAM83H chr8: 144806102- 21.29 31.33 0.56 1.08 1.00E−04 1.51E−03 familywith sequence similarity 83, 144815914 member H 662 H2AFJ chr12:14927269- 14.87 21.86 0.56 0.98 4.50E−04 5.61E−03 H2A histone family,member J 14930936 663 CLOCK chr4: 56294067- 
3.23 4.75 0.56 1.00 5.00E−046.13E−03 clock circadian regulator 56413076 664 YIPF3 chr6: 43479564-69.02 101.42 0.56 1.01 3.00E−04 3.97E−03 Yip1 domain family, member 343484728 665 SDCBP chr8: 59465727- 93.98 138.10 0.56 1.04 4.00E−045.06E−03 syndecan binding protein 59495419 (syntenin) 666 RABAC1 chr19:42460832- 107.53 157.92 0.55 0.98 2.50E−04 3.38E−03 Rab acceptor 1(prenylated) 42463528 667 MAGED2 chrX: 54834031- 25.06 36.80 0.55 0.991.00E−04 1.51E−03 melanoma antigen family D2 54842448 668 PTTG1IP chr21:46269499- 133.45 195.94 0.55 0.91 6.50E−04 7.69E−03 pituitarytumor-transforming 1 46293818 interacting protein 669 CIR1 chr2:175212877- 12.13 17.81 0.55 0.99 4.50E−04 5.61E−03 corepressorinteracting with RBPJ, 175260443 1 670 TBC1D1 chr4: 37892704- 10.3715.22 0.55 0.88 9.50E−04 1.05E−02 TBC1 (tre-2/USP6, BUB2, cdc16)38140796 domain family, member 1 671 YIPF4 chr2: 32502957- 26.02 38.180.55 1.03 2.50E−04 3.38E−03 Yip1 domain family, member 4 32531658 672RDH11 chr14: 68143518- 22.47 32.95 0.55 1.00 2.00E−04 2.80E−03 retinoldehydrogenase 11 (all- 68162510 trans/9-cis/11-cis) 673 SEC24A chr5:133984474- 10.28 15.07 0.55 0.97 4.50E−04 5.61E−03 SEC24 homolog A,COPII coat 134063601 complex component 674 VIMP chr15: 101811113- 92.90136.12 0.55 1.02 4.00E−04 5.06E−03 VCP-interacting membrane 101817725selenoprotein 675 CCNYL1 chr2: 208576263- 19.95 29.22 0.55 1.01 4.00E−045.06E−03 cyclin Y-like 1 208620896 676 SAT1 chrX: 23801274- 361.01528.26 0.55 0.93 1.20E−03 1.28E−02 spermidine/spermine N1- 23804327acetyltransferase 1 677 SLC29A1 chr6: 44187241- 8.91 13.04 0.55 0.823.90E−03 3.32E−02 solute carrier family 29 44201888 (equilibrativenucleoside transporter), member 1 678 C19orf33 chr19: 38794199- 356.92522.09 0.55 0.83 2.80E−03 2.55E−02 chromosome 19 open reading 38806606frame 33 679 VILL chr3: 38035077- 79.87 116.82 0.55 0.96 4.50E−045.61E−03 villin-like 38048676 680 RB1CC1 chr8: 53535017- 5.94 8.68 0.551.03 7.50E−04 8.64E−03 RB1-inducible coiled-coil 1 53627026 681 
SLC1A1chr9: 4490426- 20.15 29.47 0.55 1.00 1.50E−04 2.18E−03 solute carrierfamily 1 4587469 (neuronal/epithelial high affinity glutamatetransporter, system Xag), member 1 682 PLEKHA1 chr10: 124134093- 16.4023.97 0.55 1.02 5.00E−04 6.13E−03 pleckstrin homology domain 124191871containing, family A (phosphoinositide binding specific) member 1 683GCC2 chr2: 1090655761 8.24 12.04 0.55 1.03 3.50E−04 4.55E−03 GRIP andcoiled-coil domain 109125854 containing 2 684 PTMS chr12: 6875540- 58.2385.01 0.55 0.99 3.50E−04 4.55E−03 parathymosin 6880118 685 NSDHL chrX:151999510- 16.25 23.71 0.55 0.87 1.45E−03 1.49E−02 NAD(P) dependentsteroid 152037907 dehydrogenase-like 686 DAP chr5: 10679341- 50.35 73.470.55 0.98 2.50E−04 3.38E−03 death-associated protein 10761387 687 STX7chr6: 132778662- 15.81 23.07 0.55 1.04 2.00E−04 2.80E−03 syntaxin 7132834337 688 AGAP3 chr7: 150782917- 13.79 20.11 0.55 0.93 1.75E−031.74E−02 ArfGAP with GTPase domain, 150841523 ankyrin repeat and PHdomain 3 689 LRRC42 chr1: 54411998- 9.72 14.18 0.54 0.83 2.10E−032.02E−02 leucine rich repeat containing 42 54433841 690 NMD3 chr3:160939098- 9.79 14.28 0.54 0.96 2.50E−04 3.38E−03 NMD3 ribosome exportadaptor 160969795 691 ARL8A chr1: 202102531- 12.72 18.55 0.54 0.842.80E−03 2.55E−02 ADP-ribosylation factor-like 8A 202113871 692 UBE2Hchr7: 129470572- 17.51 25.53 0.54 1.00 3.00E−04 3.97E−03ubiquitin-conjugating enzyme E2H 129592800 693 AGFG1 chr2: 228336847-8.08 11.78 0.54 1.06 4.50E−04 5.61E−03 ArfGAP with FG repeats 1228425938 694 SWAP70 chr11: 9685627- 8.38 12.20 0.54 0.95 8.50E−049.59E−03 SWAP switching B-cell complex 9774507 70 kDa subunit 695 AGAchr4: 178351928- 7.75 11.28 0.54 0.85 3.20E−03 2.82E−02aspartylglucosaminidase 178363657 696 SLK chr10: 105727469- 13.07 19.010.54 1.01 3.00E−04 3.97E−03 STE20-like kinase 105787342 697 USO1 chr4:76649705- 24.81 36.06 0.54 1.02 1.50E−04 2.18E−03 USO1 vesicle transportfactor 76735442 698 SLC6A6 chr3: 14444075- 3.28 4.77 0.54 0.76 3.40E−032.97E−02 solute carrier family 6 
14581850 (neurotransmittertransporter), member 6 699 PDIA4 chr7: 148700153- 91.30 132.58 0.54 0.914.00E−04 5.06E−03 protein disulfide isomerase family 148725782 A, member4 700 TIPARP chr3: 156390959- 7.14 10.36 0.54 0.87 1.80E−03 1.78E−02TCDD-inducible poly(ADP-ribose) 156424557 polymerase 701 ODC1 chr2:10580496- 49.06 71.17 0.54 0.97 5.50E−04 6.69E−03 ornithinedecarboxylase 1 10588680 702 CLSTN3 chr12: 7282966- 5.67 8.22 0.54 0.852.15E−03 2.06E−02 calsyntenin 3 7311530 703 OSBPL5 chr11: 3108345- 11.3516.44 0.54 0.98 8.00E−04 9.10E−03 oxysterol binding protein-like 53186582 704 GSKIP chr14: 96829788- 73.03 105.83 0.54 1.01 7.50E−048.64E−03 GSK3B interacting protein 96853627 705 TMBIM1 chr2: 219135114-164.53 238.40 0.54 0.76 5.65E−03 4.49E−02 transmembrane BAX inhibitor219211516 motif containing 1 706 RAB22A chr20: 56884770- 7.20 10.43 0.531.01 2.50E−04 3.38E−03 RAB22A, member RAS oncogene 56942563 family 707PLOD1 chr1: 11994723- 18.69 27.05 0.53 0.95 2.00E−04 2.80E−03procollagen-lysine, 2-oxoglutarate 12035599 5-dioxygenase 1 708 CLTBchr5: 175819455- 177.54 256.76 0.53 0.94 1.15E−03 1.24E−02 clathrin,light chain B 175843570 709 VEZT chr12: 95611521- 9.90 14.30 0.53 0.997.50E−04 8.64E−03 vezatin, adherens junctions 95696566 transmembraneprotein 710 STAT1 chr2: 191833761- 26.71 38.59 0.53 0.92 1.40E−031.45E−02 signal transducer and activator of 191878976 transcription 1,91 kDa 711 TSSC1 chr2: 3192740- 11.64 16.81 0.53 0.83 2.80E−03 2.55E−02tumor suppressing subtransferable 3381653 candidate 1 712 SH3BP2 chr4:2794749- 12.60 18.19 0.53 0.94 8.00E−04 9.10E−03 SH3-domain bindingprotein 2 2842823 713 IDI1 chr10: 1064846- 60.42 87.18 0.53 0.901.75E−03 1.74E−02 isopentenyl-diphosphate delta 1095061 isomerase 1 714RNF8 chr6: 37321747- 2.24 3.23 0.53 0.77 5.55E−03 4.44E−02 ring fingerprotein 8, E3 ubiquitin 37362514 protein ligase 715 FAM114A1 chr4:38869353- 17.15 24.71 0.53 0.96 7.50E−04 8.64E−03 family with sequencesimilarity 38947365 114, member A1 716 RHOC chr1: 
113243748- 337.22485.78 0.53 0.86 2.35E−03 2.22E−02 ras homolog family member C 113250025717 SREK1IP1 chr5: 63986134- 3.42 4.92 0.53 0.93 1.75E−03 1.74E−02SREK1-interacting protein 1 64064496 718 CTTN chr11: 70244611- 85.89123.71 0.53 0.91 1.40E−03 1.45E−02 cortactin 70282690 719 TXN chr9:113006091- 402.08 578.79 0.53 0.89 1.65E−03 1.66E−02 thioredoxin113018920 720 ITGB1 chr10: 33189245- 107.84 155.19 0.53 0.86 2.55E−032.37E−02 integrin, beta 1 (fibronectin 33247293 receptor, betapolypeptide, antigen CD29 includes MDF2, MSK12) 721 SLC35F5 chr2:114470368- 20.78 29.89 0.52 0.99 9.00E−04 1.01E−02 solute carrier family35, member 114514400 F5 722 CD68 chr17: 7482804- 92.32 132.65 0.52 0.926.50E−04 7.69E−03 CD68 molecule 7485429 723 SLC38A5 chrX: 48316919-12.07 17.33 0.52 0.81 3.50E−03 3.04E−02 solute carrier family 38, member5 48328644 724 GPR180 chr13: 95254103- 1.76 2.53 0.52 0.79 3.05E−032.72E−02 G protein-coupled receptor 180 95286899 725 PRKAB2 chr1:146626684- 9.96 14.29 0.52 0.95 6.50E−04 7.69E−03 protein kinase,AMP-activated, 146644168 beta 2 non-catalytic subunit 726 ATP6V1G1 chr9:117349993- 47.11 67.58 0.52 0.97 5.00E−04 6.13E−03 ATPase, H+transporting, 117361152 lysosomal 13 kDa, V1 subunit G1 727 FGL2 chr7:76751933- 22.62 32.43 0.52 0.94 1.80E−03 1.78E−02 fibrinogen-like 276924521 728 POLD4 chr11: 67085309- 103.01 147.68 0.52 0.89 1.95E−031.91E−02 polymerase (DNA-directed), delta 67159158 4, accessory subunit729 MLKL chr16: 74705752- 9.50 13.62 0.52 0.86 2.10E−03 2.02E−02 mixedlineage kinase domain-like 74734789 730 TRAM1 chr8: 71485452- 60.5086.70 0.52 0.97 9.00E−04 1.01E−02 translocation associated 71520694membrane protein 1 731 ERI1 chr8: 8860313- 3.47 4.97 0.52 0.80 4.55E−033.75E−02 exoribonuclease 1 8890849 732 PLAC8 chr4: 84011210- 321.83460.37 0.52 0.87 2.45E−03 2.29E−02 placenta-specific 8 84035911 733C14orf1 chr14: 76117232- 32.02 45.78 0.52 0.88 8.00E−04 9.10E−03chromosome 14 open reading 76127538 frame 1 734 LPIN2 chr18: 2916991-14.18 20.27 0.52 
0.93 5.00E−05 7.97E−04 lipin 2 3011945 735 POMP chr13: 29233140- 68.08 97.26 0.51 1.02 5.00E−04 6.13E−03 proteasome maturation protein 29253093 736 PLA2G4F chr15: 42433331- 12.51 17.87 0.51 0.92 6.50E−04 7.69E−03 phospholipase A2, group IVF 42448839 737 SDC4 chr20: 43953928- 51.59 73.67 0.51 0.95 2.00E−04 2.80E−03 syndecan 4 43977064 738 BTF3 chr5: 72794249- 310.90 443.90 0.51 0.90 1.35E−03 1.41E−02 basic transcription factor 3 72801448 739 GBA chr1: 155204238- 25.44 36.33 0.51 0.90 1.00E−03 1.10E−02 glucosidase, beta, acid 155214653 740 OSTC chr4: 109571740- 88.50 126.30 0.51 0.98 5.50E−04 6.69E−03 oligosaccharyltransferase complex 109588978 subunit (non-catalytic) 741 TAX1BP1 chr7: 27778991- 37.91 54.10 0.51 0.99 7.00E−04 8.19E−03 Tax1 (human T-cell leukemia virus 27869386 type 1) binding protein 1 742 ARHGAP5 chr14: 32546494- 9.74 13.90 0.51 0.98 1.15E−03 1.24E−02 Rho GTPase activating protein 5 32628934 743 TMEM173 chr5: 138855112- 19.23 27.43 0.51 0.88 1.00E−03 1.10E−02 transmembrane protein 173 138862343 744 NFKB2 chr10: 104153866- 11.74 16.74 0.51 0.88 1.95E−03 1.91E−02 nuclear factor of kappa light 104162286 polypeptide gene enhancer in B-cells 2 (p49/p100) 745 FRMD8 chr11: 65154040- 15.26 21.75 0.51 0.90 1.15E−03 1.24E−02 FERM domain containing 8 65180995 746 JAGN1 chr3: 9932270- 25.51 36.33 0.51 0.88 1.20E−03 1.28E−02 jagunal homolog 1 9936031 747 PLEK2 chr14: 67853699- 37.15 52.88 0.51 0.91 6.00E−04 7.22E−03 pleckstrin 2 67878828 748 ERLEC1 chr2: 53897116- 29.35 41.74 0.51 0.86 2.05E−03 1.98E−02 endoplasmic reticulum lectin 1 54087170 749 COPB1 chr11: 14479048- 44.05 62.65 0.51 0.95 1.05E−03 1.15E−02 coatomer protein complex, subunit 14521441 beta 1 750 SBNO2 chr19: 1107632- 17.39 24.72 0.51 0.97 8.00E−04 9.10E−03 strawberry notch homolog 2 1174282 (Drosophila) 751 PSMB9 chr6_ssto_hap7: 30.29 43.06 0.51 0.79 2.45E−03 2.29E−02 proteasome subunit beta 9 4252710- 4258398 752 RALGPS2 chr1: 178694281- 8.34 11.85 0.51 0.95 1.15E−03 1.24E−02 Ral GEF with PH domain and SH3 178890977 
binding motif 2 753 NAPG chr18: 10525872- 13.16 18.69 0.51 0.93 1.10E−03 1.20E−02 N-ethylmaleimide-sensitive factor 10552766 attachment protein, gamma 754 WHAMM chr15: 83478379- 5.58 7.92 0.51 0.81 2.60E−03 2.41E−02 WAS protein homolog associated 83503613 with actin, golgi membranes and microtubules 755 FAM109A chr12: 111798454- 12.97 18.42 0.51 0.89 2.05E−03 1.98E−02 family with sequence similarity 111806925 109, member A 756 IL17RE chr3: 9944295- 31.04 44.08 0.51 0.90 1.00E−03 1.10E−02 interleukin 17 receptor E 9958084 757 MYO7B chr2: 128293377- 44.94 63.78 0.51 0.87 1.45E−03 1.49E−02 myosin VIIB 128395303 758 SEPT10 chr2: 110300373- 19.30 27.38 0.50 0.94 7.00E−04 8.19E−03 septin 10 110371783 759 TMEM106B chr7: 12250847- 11.77 16.71 0.50 1.00 1.50E−03 1.53E−02 transmembrane protein 106B 12276890 760 EFNA2 chr19: 1286152- 18.89 26.79 0.50 0.82 3.00E−03 2.69E−02 ephrin-A2 1301429 761 CHIC1 chrX: 72782983- 2.18 3.08 0.50 0.79 5.60E−03 4.46E−02 cysteine-rich hydrophobic domain 72906937 1 762 EHBP1L1 chr11: 65343508- 27.59 39.07 0.50 0.95 7.00E−04 8.19E−03 EH domain binding protein 1-like 65360116 1 763 SERPING1 chr11: 57365026- 19.79 28.01 0.50 0.84 2.35E−03 2.22E−02 serpin peptidase inhibitor, clade 57382326 G (C1 inhibitor), member 1 764 LMAN1 chr18: 56995055- 23.57 33.34 0.50 0.95 7.00E−04 8.19E−03 lectin, mannose-binding, 1 57026508 765 CXCL12 chr10: 44865600- 25.41 17.95 −0.50 −0.74 3.50E−03 3.04E−02 chemokine (C-X-C motif) ligand 44880545 12 766 SLC9A2 chr2: 103236165- 22.28 15.73 −0.50 −0.89 1.35E−03 1.41E−02 solute carrier family 9, subfamily 103327809 A (NHE2, cation proton antiporter 2), member 2 767 SFXN1 chr5: 174905513- 38.66 27.29 −0.50 −0.92 7.50E−04 8.64E−03 sideroflexin 1 174955621 768 GRAMD3 chr5: 125695787- 30.37 21.41 −0.50 −0.93 7.00E−04 8.19E−03 GRAM domain containing 3 125829853 769 HMOX1 chr22: 35777059- 62.11 43.78 −0.50 −0.86 1.35E−03 1.41E−02 heme oxygenase 1 35790207 770 HNF1B chr17: 36046433- 16.35 11.52 −0.51 −0.84 2.55E−03 2.37E−02 HNF1 homeobox B 36105096 771 
UQCRFS1 chr19: 29698166- 243.33 171.33 −0.51 −0.88 6.00E−04 7.22E−03 ubiquinol-cytochrome c reductase, 29704136 Rieske iron-sulfur polypeptide 1 772 GOLGA2P5 chr12: 100550174- 12.14 8.55 −0.51 −0.79 3.60E−03 3.10E−02 golgin A2 pseudogene 5 100567121 773 ABCA1 chr9: 107543283- 4.46 3.14 −0.51 −0.83 2.65E−03 2.44E−02 ATP-binding cassette, sub-family 107690527 A (ABC1), member 1 774 PPTC7 chr12: 110972236- 13.84 9.73 −0.51 −0.84 2.15E−03 2.06E−02 PTC7 protein phosphatase 111021064 homolog 775 NDUFV1 chr11: 67374322- 164.54 115.65 −0.51 −0.83 3.40E−03 2.97E−02 NADH dehydrogenase 67380012 (ubiquinone) flavoprotein 1, 51 kDa 776 PWWP2A chr5: 159502891- 14.18 9.96 −0.51 −0.88 1.75E−03 1.74E−02 PWWP domain containing 2A 159546452 777 PAPD5 chr16: 50186828- 3.62 2.53 −0.51 −0.83 3.90E−03 3.32E−02 PAP associated domain containing 50269219 5 778 CECR1 chr22: 17659679- 15.97 11.18 −0.51 −0.86 1.75E−03 1.74E−02 cat eye syndrome chromosome 17702744 region, candidate 1 779 FLVCR2 chr14: 76041246- 12.44 8.68 −0.52 −0.82 2.95E−03 2.66E−02 feline leukemia virus subgroup C 76114512 cellular receptor family, member 2 780 CC2D1A chr19: 14016955- 39.58 27.56 −0.52 −0.97 5.50E−04 6.69E−03 coiled-coil and C2 domain 14041693 containing 1A 781 DDC chr7: 50526133- 48.89 34.04 −0.52 −0.92 6.50E−04 7.69E−03 dopa decarboxylase (aromatic L- 50633154 amino acid decarboxylase) 782 CPOX chr3: 98298289- 12.79 8.90 −0.52 −0.85 1.55E−03 1.57E−02 coproporphyrinogen oxidase 98312455 783 ABAT chr16: 8768443- 10.03 6.97 −0.52 −0.89 1.40E−03 1.45E−02 4-aminobutyrate aminotransferase 8878432 784 MYBL2 chr20: 42295658- 9.16 6.36 −0.53 −0.77 4.95E−03 4.02E−02 v-myb avian myeloblastosis viral 42345136 oncogene homolog-like 2 785 TRPM4 chr19: 49661015- 109.31 75.95 −0.53 −0.93 1.40E−03 1.45E−02 transient receptor potential cation 49715098 channel, subfamily M, member 4 786 GAB2 chr11: 77926335- 7.49 5.20 −0.53 −0.93 9.50E−04 1.05E−02 GRB2-associated binding protein 2 78128868 787 RRM2B chr8: 103216728- 8.75 6.07 −0.53 −0.91 
2.50E−032.33E−02 ribonucleotide reductase M2 B 103251346 (TP53 inducible) 788LYRM7 chr5: 130506640- 5.03 3.49 −0.53 −0.88 2.10E−03 2.02E−02 LYR motifcontaining 7 130541119 789 ABO chr9: 136130562- 37.30 25.84 −0.53 −0.948.00E−04 9.10E−03 ABO blood group (transferase A, 136150630 alpha 1-3-N-acetylgalactosaminyltransferase; transferase B, alpha 1-3-galactosyltransferase) 790 ACOX1 chr17: 73937588- 47.96 33.17 −0.53−0.83 2.60E−03 2.41E−02 acyl-CoA oxidase 1, palmitoyl 74002080 791 CAAP1chr9: 26840682- 19.24 13.28 −0.53 −0.96 1.20E−03 1.28E−02 caspaseactivity and apoptosis 26892826 inhibitor 1 792 MAMDC4 chr9: 139746818-12.27 8.45 −0.54 −0.90 8.50E−04 9.59E−03 MAM domain containing 4139755251 793 FGFR3 chr4: 1795038- 24.26 16.72 −0.54 −1.00 3.00E−043.97E−03 fibroblast growth factor 1810599 receptor 3 794 ALDH1B1 chr9:38392660- 25.85 17.81 −0.54 −0.93 9.00E−04 1.01E−02 aldehydedehydrogenase 1 family, 38398662 member B1 795 DPYD chr1: 97543299- 4.342.99 −0.54 −0.75 5.20E−03 4.20E−02 dihydropyrimidine dehydrogenase98386615 796 SNX30 chr9: 115513133- 10.39 7.15 −0.54 −0.95 3.50E−044.55E−03 sorting nexin family member 30 115637267 797 ACSF3 chr16:89160216- 10.15 6.98 −0.54 −0.85 2.55E−03 2.37E−02 acyl-CoA synthetasefamily 89222254 member 3 798 SGK2 chr20: 42187634- 66.68 45.87 −0.54−0.97 4.00E−04 5.06E−03 serum/glucocorticoid regulated 42214273 kinase 2799 KDM4A chr1: 44115796- 20.94 14.39 −0.54 −0.97 5.00E−04 6.13E−03lysine (K)-specific demethylase 44173012 4A 800 SLC17A4 chr6: 25754926-41.09 28.22 −0.54 −0.98 3.50E−04 4.55E−03 solute carrier family 17,25781403 member 4 801 SEC31B chr10: 102246402- 7.55 5.18 −0.54 −0.891.10E−03 1.20E−02 SEC31 homolog B, COPII coat 102279595 complexcomponent 802 SEPHS2 chr16: 30454945- 113.79 78.03 −0.54 −0.98 1.50E−042.18E−03 selenophosphate synthetase 2 30457296 803 LPCAT3 chr12:7085346- 71.15 48.77 −0.54 −0.99 3.00E−04 3.97E−03lysophosphatidylcholine 7125842 acyltransferase 3 804 DEPDC5 chr22:32149936- 6.48 4.43 −0.55 −0.88 1.55E−03 
1.57E−02 DEP domain containing5 32303020 805 PDK4 chr7: 95212808- 23.85 16.30 −0.55 −0.97 4.00E−045.06E−03 pyruvate dehydrogenase kinase, 95225925 isozyme 4 806 MESTchr7: 130126015- 23.02 15.73 −0.55 −0.82 4.15E−03 3.50E−02 mesodermspecific transcript 130371406 807 ZNF704 chr8: 81540685- 3.21 2.19 −0.55−0.95 4.00E−04 5.06E−03 zinc finger protein 704 81787016 808 ZNF462chr9: 109625377- 2.19 1.49 −0.55 −0.80 3.00E−03 2.69E−02 zinc fingerprotein 462 109848716 809 SGPP1 chr14: 64150934- 15.01 10.22 −0.55 −0.954.50E−04 5.61E−03 sphingosine-1 -phosphate 64194756 phosphatase 1 810COL14A1 chr8: 121137346- 6.24 4.24 −0.56 −0.94 6.00E−04 7.22E−03collagen, type XIV, alpha 1 121384273 811 IGSF9 chr1: 159896828- 34.1423.21 −0.56 −1.03 4.00E−04 5.06E−03 immunoglobulin superfamily,159915386 member 9 812 NIPSNAP3A chr9: 107509968- 41.39 28.12 −0.56−0.99 4.50E−04 5.61E−03 nipsnap homolog 3A 107522403 (C. elegans) 813FN3K chr17: 80693451- 20.04 13.61 −0.56 −0.79 4.85E−03 3.96E−02fructosamine 3 kinase 80709073 814 TRIM24 chr7: 138145078- 8.23 5.59−0.56 −0.92 9.50E−04 1.05E−02 tripartite motif containing 24 138270332815 SNHG18 chr5: 9546311- 19.52 13.23 −0.56 −0.83 2.30E−03 2.18E−02small nucleolar RNA host gene 18 9550409 816 HOXA3 chr7: 27145808- 8.585.81 −0.56 −0.83 3.05E−03 2.72E−02 homeobox A3 27166639 817 TLE3 chr15:70340129- 14.77 9.99 −0.56 −0.97 8.50E−04 9.59E−03 transducin-likeenhancer of 70390256 split 3 818 ADH6 chr4: 100010007- 13.20 8.93 −0.56−0.83 5.10E−03 4.13E−02 alcohol dehydrogenase 6 (class V) 100222513 819PLCD1 chr3: 38048986- 33.01 22.31 −0.57 −0.94 1.45E−03 1.49E−02phospholipase C, delta 1 38071154 820 PAPSS2 chr10: 89419475- 132.9589.82 −0.57 −0.93 4.00E−04 5.06E−03 3′-phosphoadenosine 5′- 89507462phosphosulfate synthase 2 821 LRRC19 chr9: 26903367- 57.59 38.91 −0.57−0.91 2.35E−03 2.22E−02 leucine rich repeat containing 27062931 19 822MAGI1 chr3: 65339905- 7.17 4.84 −0.57 −0.98 7.50E−04 8.64E−03 membraneassociated guanylate 66024509 kinase, WW and PDZ domain 
containing 1 823DNAH1 chr3: 52350334- 3.18 2.14 −0.57 −0.99 3.00E−04 3.97E−03 dynein,axonemal, heavy chain 1 52434513 824 ARHGAP33 chr19: 36266416- 5.23 3.51−0.57 −0.82 5.10E−03 4.13E−02 Rho GTPase activating protein 33 36279724825 PRR5L chr11: 36317724- 30.51 20.49 −0.57 −1.02 4.00E−04 5.06E−03proline rich 5 like 36486754 826 P2RY1 chr3: 152552735- 7.86 5.28 −0.57−0.82 1.80E−03 1.78E−02 purinergic receptor P2Y, 152555843 G-proteincoupled, 1 827 MAVS chr20: 3827445- 34.18 22.95 −0.57 −1.00 4.50E−045.61E−03 mitochondrial antiviral signaling 3856770 protein 828 MIR600HGchr9: 125871772- 4.72 3.17 −0.57 −0.89 1.30E−03 1.37E−02 MIR600 hostgene 125877756 829 TPRN chr9: 140086068- 68.40 45.91 −0.58 −1.001.10E−03 1.20E−02 taperin 140095163 830 NXPE4 chr11: 114441312- 84.2356.49 −0.58 −1.02 6.50E−04 7.69E−03 neurexophilin and PC-esterase114466484 domain family, member 4 831 LETM1 chr4: 1813205- 39.16 26.20−0.58 −1.06 1.50E−04 2.18E−03 leucine zipper-EF-hand containing 1857974transmembrane protein 1 832 CBFA2T3 chr16: 88941262- 4.20 2.81 −0.58−0.77 6.45E−03 5.00E−02 core-binding factor, runt domain, 89043504 alphasubunit 2; translocated to, 3 833 GPR160 chr3: 169755734- 74.24 49.55−0.58 −1.08 5.00E−05 7.97E−04 G protein-coupled receptor 160 169803183834 SCO1 chr17: 10583648- 36.47 24.32 −0.58 −1.04 1.00E−04 1.51E−03 SCO1cytochrome c oxidase 10600885 assembly protein 835 ENGASE chr17:77071018- 34.01 22.67 −0.59 −1.07 5.00E−05 7.97E−04 endo-beta-N-77084685 acetylglucosaminidase 836 PDXP chr22: 38054736- 26.96 17.93−0.59 −0.97 4.50E−04 5.61E−03 pyridoxal (pyridoxine, vitamin 38062939B6) phosphatase 837 BDH1 chr3: 197236653- 59.31 39.30 −0.59 −1.032.00E−04 2.80E−03 3-hydroxybutyrate dehydrogenase, 197300194 type 1 838TFRC chr3: 195776154- 79.84 52.86 −0.60 −0.89 1.75E−03 1.74E−02transferrin receptor 195809032 839 PDK2 chr17: 48172100- 35.38 23.41−0.60 −0.95 1.05E−03 1.15E−02 pyruvate dehydrogenase kinase, 48207246isozyme 2 840 GNA11 chr19: 3094407- 116.16 76.65 −0.60 −1.07 
1.50E−04 2.18E−03 guanine nucleotide binding protein 3124000 (G protein), alpha 11 (Gq class) 841 GOLGA8A chr15: 34671269- 6.89 4.54 −0.60 −0.93 7.00E−04 8.19E−03 golgin A8 family, member A 34729667 842 KIFC2 chr8: 145691737- 30.52 20.11 −0.60 −0.97 1.05E−03 1.15E−02 kinesin family member C2 145701718 843 C15orf52 chr15: 40623652- 14.06 9.23 −0.61 −1.09 3.00E−04 3.97E−03 chromosome 15 open reading 40633168 frame 52 844 CDH24 chr14: 23516269- 4.16 2.72 −0.61 −0.80 6.00E−03 4.72E−02 cadherin 24, type 2 23526747 845 CPA3 chr3: 148583042- 25.76 16.81 −0.62 −0.79 3.65E−03 3.14E−02 carboxypeptidase A3 (mast cell) 148614872 846 LOX chr5: 121398889- 2.94 1.92 −0.62 −0.83 4.50E−03 3.72E−02 lysyl oxidase 121414055 847 ZDHHC2 chr8: 17013835- 10.15 6.62 −0.62 −1.03 1.50E−04 2.18E−03 zinc finger, DHHC-type containing 17080241 2 848 FOXD2-AS1 chr1: 47897806- 10.58 6.90 −0.62 −0.86 1.95E−03 1.91E−02 FOXD2 antisense RNA 1 (head to 47900313 head) 849 MYO1D chr17: 30819627- 163.85 106.54 −0.62 −0.93 3.50E−04 4.55E−03 myosin ID 31203902 850 CLUH chr17: 2592679- 45.24 29.39 −0.62 −1.17 5.00E−05 7.97E−04 clustered mitochondria 2614927 (cluA/CLU1) homolog 851 ACADS chr12: 121163570- 132.93 86.33 −0.62 −1.02 1.00E−04 1.51E−03 acyl-CoA dehydrogenase, C-2 to 121177811 C-3 short chain 852 BCL2 chr18: 60790578- 4.52 2.93 −0.63 −0.83 4.50E−04 5.61E−03 B-cell CLL/lymphoma 2 60986613 853 B3GNT6 chr11: 76745384- 35.42 22.93 −0.63 −1.16 1.00E−04 1.51E−03 UDP-GlcNAc: betaGal beta-1,3-N- 767530054- acetylglucosaminyltransferase 6 854 ZNF764 chr16: 30565084- 6.17 3.99 −0.63 −0.86 2.25E−03 2.14E−02 zinc finger protein 764 30569642 855 ACAT1 chr11: 107992257- 65.08 42.01 −0.63 −1.14 5.00E−05 7.97E−04 acetyl-CoA acetyltransferase 1 108018891 856 TMEM8B chr9: 35829221- 8.77 5.66 −0.63 −0.91 2.45E−03 2.29E−02 transmembrane protein 8B 35854844 857 GADD45B chr19: 2476122- 14.41 9.29 −0.63 −0.84 3.75E−03 3.22E−02 growth arrest and DNA-damage- 2478257 inducible, beta 858 NRARP chr9: 140194082- 88.57 57.08 −0.63 −1.15 5.00E−05 
7.97E−04NOTCH-regulated ankyrin repeat 140196703 protein 859 RCN3 chr19:50030874- 34.55 22.26 −0.63 −0.94 1.10E−03 1.20E−02 reticulocalbin 3,EF-hand calcium 50046890 binding domain 860 NHSL1 chr6: 138743180- 21.0913.58 −0.64 −1.14 1.00E−04 1.51E−03 NHS-like 1 138893668 861 LAPTM4Bchr8: 98787808- 31.25 20.09 −0.64 −1.11 5.00E−05 7.97E−04 lysosomalprotein transmembrane 98864830 4 beta 862 KCNK10 chr14: 88646451- 1.200.77 −0.64 −0.75 5.90E−03 4.65E−02 potassium channel, two pore 88793256domain subfamily K, member 10 863 NR6A1 chr9: 127279553- 2.46 1.58 −0.64−0.87 1.45E−03 1.49E−02 nuclear receptor subfamily 6, 127533589 group A,member 1 864 AHCYL2 chr7: 128864854- 141.01 90.52 −0.64 −0.96 6.50E−047.69E−03 adenosylhomocysteinase-like 2 129070052 865 GLIPR2 chr9:36136532- 19.21 12.33 −0.64 −0.95 7.00E−04 8.19E−03 GLIpathogenesis-related 2 36163910 866 DMD chrX: 31137344- 2.86 1.83 −0.64−0.81 5.30E−03 4.27E−02 dystrophin 33357726 867 PKIG chr20: 43160421-40.24 25.79 −0.64 −1.04 2.50E−04 3.38E−03 protein kinase(cAMP-dependent, 43247678 catalytic) inhibitor gamma 868 GCSHP3 chr2:206980296- 16.68 10.69 −0.64 −0.85 2.70E−03 2.48E−02 glycine cleavagesystem protein H 206981296 (aminomethyl carrier) pseudogene 3 869 E2F8chr11: 19245609- 4.10 2.63 −0.64 −0.87 1.60E−03 1.62E−02 E2Ftranscription factor 8 19263202 870 SCARA5 chr8: 27727398- 14.59 9.33−0.64 −1.11 1.00E−04 1.51E−03 scavenger receptor class A, 27850369member 5 871 MAP2K6 chr17: 67410837- 16.77 10.72 −0.65 −1.02 2.50E−043.38E−03 mitogen-activated protein kinase 67538470 kinase 6 872 ARHGEF9chrX: 62854847- 8.54 5.45 −0.65 −1.08 2.00E−04 2.80E−03 Cdc42 guaninenucleotide 63005426 exchange factor (GEF) 9 873 SSTR1 chr14: 38677203-4.43 2.82 −0.65 −0.84 3.00E−03 2.69E−02 somatostatin receptor 1 38682268874 FAM43A chr3: 194406621- 8.47 5.39 −0.65 −0.93 1.55E−03 1.57E−02family with sequence similarity 194409766 43, member A 875 BRINP3 chr1:190066796- 3.28 2.08 −0.65 −0.75 3.70E−03 3.18E−02 bone morphogenetic190446759 
protein/retinoic acid inducible neural-specific 3 876 PLCG2 chr16: 81812862- 5.66 3.59 −0.66 −1.05 5.00E−05 7.97E−04 phospholipase C, gamma 2 81996298 (phosphatidylinositol-specific) 877 FABP5 chr8: 82192717- 255.22 161.88 −0.66 −1.14 5.00E−05 7.97E−04 fatty acid binding protein 5 82197012 (psoriasis-associated) 878 TTC30A chr2: 178479025- 2.18 1.38 −0.66 −0.81 5.30E−03 4.27E−02 tetratricopeptide repeat domain 178483694 30A 879 MARC1 chr1: 220960038- 18.60 11.74 −0.66 −1.06 2.00E−04 2.80E−03 mitochondrial amidoxime reducing 220987741 component 1 880 ME2 chr18: 48405431- 31.18 19.68 −0.66 −1.22 5.00E−05 7.97E−04 malic enzyme 2, NAD(+)- 48476162 dependent, mitochondrial 881 MEGF8 chr19: 42829760- 5.71 3.60 −0.66 −1.21 1.00E−04 1.51E−03 multiple EGF-like-domains 8 42882921 882 FRRS1 chr1: 100111430- 5.79 3.65 −0.66 −0.81 5.70E−03 4.52E−02 ferric-chelate reductase 1 100231349 883 SFXN5 chr2: 73169164- 11.09 6.99 −0.67 −1.10 5.00E−05 7.97E−04 sideroflexin 5 73298965 884 LINC01004 chr7: 104622193- 6.51 4.10 −0.67 −1.20 5.00E−05 7.97E−04 long intergenic non-protein coding 104631612 RNA 1004 885 GIPC2 chr1: 78511588- 19.87 12.52 −0.67 −1.19 5.00E−05 7.97E−04 GIPC PDZ domain containing 78603112 family, member 2 886 ALDH5A1 chr6: 24495196- 8.33 5.24 −0.67 −1.09 1.50E−04 2.18E−03 aldehyde dehydrogenase 5 family, 24537435 member A1 887 PTGDR chr14: 52734430- 7.62 4.79 −0.67 −0.95 1.40E−03 1.45E−02 prostaglandin D2 receptor (DP) 52743442 888 PDE8A chr15: 85523743- 39.74 24.98 −0.67 −1.21 5.00E−05 7.97E−04 phosphodiesterase 8A 85682376 889 PIGZ chr3: 196673214- 54.26 34.10 −0.67 −1.12 1.00E−04 1.51E−03 phosphatidylinositol glycan anchor 196695742 biosynthesis, class Z 890 ENTPD5 chr14: 74433180- 73.01 45.71 −0.68 −1.17 5.00E−05 7.97E−04 ectonucleoside triphosphate 74486026 diphosphohydrolase 5 891 KREMEN1 chr22: 29469065- 13.91 8.70 −0.68 −1.19 5.00E−05 7.97E−04 kringle containing transmembrane 29564321 protein 1 892 PGAP3 chr17: 37827374- 38.45 23.97 −0.68 −1.16 1.00E−04 1.51E−03 post-GPI attachment to 
proteins 3 37844323 893 NRG1 chr8: 31497267- 3.111.94 −0.68 −0.73 6.25E−03 4.88E−02 neuregulin 1 32622558 894 HADH chr4:108910869- 116.76 72.02 −0.70 −1.22 5.00E−05 7.97E−04 hydroxyacyl-CoAdehydrogenase 108956331 895 ARHGEF37 chr5: 148961134- 10.02 6.18 −0.70−1.20 5.00E−05 7.97E−04 Rho guanine nucleotide exchange 149014527 factor(GEF) 37 896 PBX1 chr1: 164528596- 9.57 5.90 −0.70 −1.13 5.00E−057.97E−04 pre-B-cell leukemia homeobox 1 164821060 897 MAOA chrX:43514154- 103.83 63.98 −0.70 −1.11 5.00E−05 7.97E−04 monoamine oxidase A43606071 898 CAMK1D chr10: 12875132- 11.95 7.34 −0.70 −1.01 5.00E−046.13E−03 12877545 899 BAHCC1 chr17: 79373520- 2.84 1.75 −0.70 −1.112.00E−04 2.80E−03 BAH domain and coiled-coil 79433358 containing 1 900MAN1A1 chr6: 119498365- 36.51 22.41 −0.70 −1.24 5.00E−05 7.97E−04mannosidase, alpha, class 1A, 119670931 member 1 901 KIT chr4: 55524094-6.51 3.99 −0.70 −1.05 2.00E−04 2.80E−03 v-kit Hardy-Zuckerman 4 feline55606881 sarcoma viral oncogene homolog 902 MEIS3P1 chr17: 15690163-4.60 2.82 −0.71 −0.91 1.05E−03 1.15E−02 Meis homeobox 3 pseudogene 115693019 903 HAPLN1 chr5: 82934016- 4.46 2.72 −0.71 −1.06 4.50E−045.61E−03 hyaluronan and proteoglycan link 83016896 protein 1 904 SDR42E1chr16: 82031250- 5.65 3.45 −0.71 −0.92 1.20E−03 1.28E−02 short chain82045093 dehydrogenase/reductase family 42E, member 1 905 WNK2 chr9:95947211- 13.74 8.38 −0.71 −1.26 5.00E−05 7.97E−04 WNK lysine deficientprotein 96108696 kinase 2 906 PLOD2 chr3: 145787227- 41.40 25.25 −0.71−1.33 1.00E−04 1.51E−03 procollagen-lysine, 2-oxoglutarate 1458792825-dioxygenase 2 907 IL6R chr1: 154377668- 20.00 12.20 −0.71 −1.275.00E−05 7.97E−04 interleukin 6 receptor 154441926 908 PCSK5 chr9:78505559- 10.92 6.66 −0.71 −1.25 5.00E−05 7.97E−04 proprotein convertase78977255 subtilisin/kexin type 5 909 TMEM209 chr7: 129804554- 9.41 5.73−0.71 −1.16 5.00E−05 7.97E−04 transmembrane protein 209 129845338 910MOGAT2 chr11: 75428933- 48.08 29.26 −0.72 −1.17 5.00E−05 7.97E−04monoacylglycerol O- 75442331 
acyltransferase 2 911 SLC4A7 chr3:27414211- 4.69 2.85 −0.72 −1.16 5.00E−05 7.97E−04 solute carrier family4, sodium 27525911 bicarbonate cotransporter, member 7 912 ZNF132 chr19:58944180- 1.44 0.87 −0.72 −0.71 6.40E−03 4.97E−02 zinc finger protein132 58951589 913 C7orf31 chr7: 25174315- 4.99 3.03 −0.72 −1.03 1.50E−042.18E−03 chromosome 7 open reading frame 25219817 31 914 ZBTB10 chr8:81397853- 3.27 1.98 −0.72 −1.13 4.00E−04 5.06E−03 zinc finger and BTBdomain 81438500 containing 10 915 FLJ22763 chr3: 108855560- 8.70 5.26−0.73 −0.95 2.55E−03 2.37E−02 uncharacterized LOC401081 108868951 916SCAP chr3: 47455183- 43.81 26.48 −0.73 −1.35 5.00E−05 7.97E−04 SREBFchaperone 47517445 917 MTSS1 chr8: 125563010- 8.18 4.95 −0.73 −1.195.00E−05 7.97E−04 metastasis suppressor 1 125740748 918 CES3 chr16:66995131- 53.33 31.98 −0.74 −1.27 5.00E−05 7.97E−04 carboxylesterase 367009052 919 ACACB chr12: 109577201- 7.25 4.33 −0.74 −1.30 5.00E−057.97E−04 acetyl-CoA carboxylase beta 109706030 920 ZNF813 chr19:53970988- 1.20 0.72 −0.75 −0.75 5.15E−03 4.16E−02 zinc finger protein813 53997546 921 CLDN15 chr7: 100875372- 26.77 15.95 −0.75 −1.135.00E−05 7.97E−04 claudin 15 100882101 922 DLL1 chr6: 170591293- 9.555.67 −0.75 −1.15 5.00E−05 7.97E−04 delta-like 1 (Drosophila) 170599697923 NCAM1 chr11: 112830002- 1.83 1.08 −0.75 −0.91 1.25E−03 1.33E−02neural cell adhesion molecule 1 113149158 924 LRP12 chr8: 105501458-3.70 2.19 −0.76 −1.00 6.00E−04 7.22E−03 low density lipoproteinreceptor- 105601252 related protein 12 925 ATOH1 chr4: 94750077- 21.8312.89 −0.76 −1.07 4.00E−04 5.06E−03 atonal bHLH transcription factor 194751142 926 FOXD2 chr1: 47901688- 10.10 5.95 −0.76 −1.25 5.00E−057.97E−04 forkhead box D2 47906363 927 ID3 chr1: 23884420- 105.85 62.35−0.76 −1.27 5.00E−05 7.97E−04 inhibitor of DNA binding 3, 23886285dominant negative helix-loop-helix protein 928 SLC35G1 chr10: 95653729-10.01 5.90 −0.76 −1.04 2.00E−04 2.80E−03 solute carrier family 35,member 95662491 G1 929 HPGDS chr4: 95219706- 5.71 3.35 
−0.77 −0.893.65E−03 3.14E−02 hematopoietic prostaglandin D 95264027 synthase 930NOTCH1 chr9: 139388895- 13.34 7.78 −0.78 −1.49 5.00E−05 7.97E−04 notch 1139440238 931 CPT1A chr11: 68522087- 93.66 54.51 −0.78 −1.31 5.00E−057.97E−04 carnitine palmitoyltransferase 1A 68609399 (liver) 932 HR chr8:21971931- 19.15 11.13 −0.78 −1.46 5.00E−05 7.97E−04 hair growthassociated 21988565 933 KRT12 chr17: 39017429- 4.53 2.63 −0.78 −0.886.50E−04 7.69E−03 keratin 12, type I 39023462 934 KITLG chr12: 88886569-13.52 7.86 −0.78 −1.38 5.00E−05 7.97E−04 KIT ligand 88974250 935 SLC39A5chr12: 56623819- 107.72 62.56 −0.78 −0.93 4.00E−03 3.39E−02 solutecarrier family 39 (zinc 56652143 transporter), member 5 936 E2F2 chr1:23832919- 7.97 4.62 −0.79 −1.26 5.00E−05 7.97E−04 E2F transcriptionfactor 2 23857712 937 TBC1D9 chr4: 141541935- 6.23 3.60 −0.79 −1.245.00E−05 7.97E−04 TBC1 domain family, member 9 141677471 (with GRAMdomain) 938 CDX2 chr13: 28536204- 154.63 89.40 −0.79 −1.31 5.00E−057.97E−04 caudal type homeobox 2 28543505 939 ACSF2 chr17: 48503518-51.16 29.40 −0.80 −0.91 4.25E−03 3.56E−02 acyl-CoA synthetase family48552206 member 2 940 ZFP3 chr17: 4981753- 5.11 2.94 −0.80 −1.205.00E−05 7.97E−04 ZFP3 zinc finger protein 4999669 941 TSPAN7 chrX:38420730- 64.64 37.09 −0.80 −1.37 5.00E−05 7.97E−04 tetraspanin 738548172 942 KCNJ2 chr17: 68165675- 3.48 2.00 −0.80 −1.09 1.00E−041.51E−03 potassium channel, inwardly 68176183 rectifying subfamily J,member 2 943 PPP1R14C chr6: 150464187- 19.38 11.10 −0.80 −1.28 5.00E−057.97E−04 protein phosphatase 1, regulatory 150571528 (inhibitor) subunit14C 944 WDR78 chr1: 67278571- 5.47 3.13 −0.80 −1.17 2.50E−04 3.38E−03 WDrepeat domain 78 67390570 945 SATB2 chr2: 200134222- 40.25 23.04 −0.81−1.14 5.00E−05 7.97E−04 SATB homeobox 2 200337481 946 AIFM3 chr22:21319417- 41.52 23.75 −0.81 −1.17 5.00E−05 7.97E−04 apoptosis-inducingfactor, 21335649 mitochondrion-associated, 3 947 SCAMP5 chr15: 75287875-7.90 4.51 −0.81 −1.07 2.50E−04 3.38E−03 secretory carrier 
membraneprotein 75313836 5 948 ZNF606 chr19: 58488440- 2.16 1.23 −0.81 −0.883.00E−03 2.69E−02 zinc finger protein 606 58518574 949 C10orf99 chr10:85933553- 314.00 178.88 −0.81 −1.13 5.00E−05 7.97E−04 chromosome 10 openreading 85945050 frame 99 950 RNLS chr10: 90033620- 4.75 2.71 −0.81−1.01 1.40E−03 1.45E−02 renalase, FAD-dependent amine 90343082 oxidase951 HRCT1 chr9: 35906188- 58.04 32.98 −0.82 −1.25 5.00E−05 7.97E−04histidine rich carboxyl terminus 1 35907138 952 SYNM chr15: 99645285-1.62 0.91 −0.83 −0.98 3.00E−04 3.97E−03 synemin, intermediate filament99675800 protein 953 USP32P2 chr17: 18414575- 0.97 0.55 −0.83 −0.773.80E−03 3.25E−02 ubiquitin specific peptidase 32 18424566 pseudogene 2954 SYNPO chr5: 149980641- 30.81 17.30 −0.83 −1.54 5.00E−05 7.97E−04synaptopodin 150038792 955 CYCS chr7: 25158269- 114.56 64.22 −0.84 −1.305.00E−05 7.97E−04 cytochrome c, somatic 25164980 956 HOXD4 chr2:177016112- 5.77 3.23 −0.84 −0.86 1.50E−03 1.53E−02 homeobox D4 177017949957 PAQR5 chr15: 69591293- 18.45 10.33 −0.84 −1.48 5.00E−05 7.97E−04progestin and adipoQ receptor 69699976 family member V 958 SLC4A4 chr4:72053002- 43.20 24.14 −0.84 −1.33 5.00E−05 7.97E−04 solute carrierfamily 4 (sodium 72437804 bicarbonate cotransporter), member 4 959LRRC26 chr9: 140033608- 50.87 28.28 −0.85 −1.12 8.00E−04 9.10E−03leucine rich repeat containing 26 140064491 960 GDPD1 chr17: 57297827-2.58 1.43 −0.85 −0.82 5.75E−03 4.55E−02 glycerophosphodiester 57353330phosphodiesterase domain containing 1 961 DHRS11 chr17: 34948225- 158.5787.93 −0.85 −1.39 5.00E−05 7.97E−04 dehydrogenase/reductase (SDR349572339 family) member 11 962 LGI4 chr19: 35615416- 6.00 3.32 −0.85−0.94 6.50E−04 7.69E−03 leucine-rich repeat LGI family, 35626178 member4 963 INPP5J chr22: 31518908- 18.03 9.99 −0.85 −1.37 5.00E−05 7.97E−04inositol polyphosphate-5- 31530683 phosphatase J 964 AMOT chrX:112018104- 4.18 2.32 −0.85 −1.25 5.00E−05 7.97E−04 angiomotin 112084043965 BCL2L15 chr1: 114356432- 40.09 22.19 −0.85 −1.29 5.00E−05 
7.97E−04BCL2-like 15 114447741 966 PDE4C chr19: 18318770- 2.98 1.65 −0.86 −1.062.00E−04 2.80E−03 phosphodiesterase 4C, cAMP- 18359010 specific 967CYP27A1 chr2: 219646471- 27.19 14.98 −0.86 −1.39 5.00E−05 7.97E−04cytochrome P450, family 27, 219680016 subfamily A, polypeptide 1 968SLC19A3 chr2: 228549925- 4.07 2.23 −0.87 −1.05 2.50E−04 3.38E−03 solutecarrier family 19 (thiamine 228582745 transporter), member 3 969 SEMA5Achr5: 9035137- 6.75 3.70 −0.87 −1.45 5.00E−05 7.97E−04 sema domain,seven 9546233 thrombospondin repeats (type 1 and type 1-like),transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 5A970 SH3BP1 chr22: 38035683- 36.87 20.12 −0.87 −1.41 5.00E−05 7.97E−04SH3-domain binding protein 1 38054384 971 TTPA chr8: 63972047- 3.44 1.88−0.87 −0.98 1.00E−03 1.10E−02 tocopherol (alpha) transfer 63998612protein 972 GLIS3 chr9: 3824127- 1.43 0.78 −0.88 −0.95 5.00E−04 6.13E−03GLIS family zinc finger 3 4300035 973 RPS6KA6 chrX: 83313353- 3.55 1.93−0.88 −1.43 5.00E−05 7.97E−04 ribosomal protein S6 kinase, 83442943 90kDa, polypeptide 6 974 ZNF518B chr4: 10441503- 2.13 1.15 −0.88 −1.185.00E−05 7.97E−04 zinc finger protein 518B 10459032 975 EFNA5 chr5:106712589- 2.23 1.21 −0.88 −1.06 5.00E−05 7.97E−04 ephrin-A5 107006596976 MIPOL1 chr14: 37667117- 3.04 1.64 −0.89 −1.30 5.00E−05 7.97E−04mirror-image polydactyly 1 38020464 977 CNN3 chr1: 95362504- 68.19 36.78−0.89 −1.62 5.00E−05 7.97E−04 calponin 3, acidic 95392779 978 PHLPP2chr16: 71678828- 21.57 11.62 −0.89 −1.58 5.00E−05 7.97E−04 PH domain andleucine rich repeat 71758604 protein phosphatase 2 979 PPP2R3A chr3:135684514- 5.34 2.88 −0.89 −1.27 1.00E−04 1.51E−03 protein phosphatase2, regulatory 135866752 subunit B″, alpha 980 NCKAP5 chr2: 133429371-1.00 0.54 −0.90 −0.85 3.40E−03 2.97E−02 NCK-associated protein 5134326031 981 SH2D7 chr15: 78384926- 3.31 1.78 −0.90 −0.89 3.20E−032.82E−02 SH2 domain containing 7 78396393 982 RNF157 chr17: 74132414-7.83 4.19 −0.90 −1.30 5.00E−05 7.97E−04 ring finger protein 157 
74236390983 CAMK1D chr10: 12391582- 7.03 3.76 −0.90 −1.09 4.00E−04 5.06E−03calcium/calmodulin-dependent 12871733 protein kinase ID 984 TMEM171chr5: 72416387- 87.05 46.45 −0.91 −1.58 5.00E−05 7.97E−04 transmembraneprotein 171 72427644 985 GPM6B chrX: 13789061- 2.91 1.55 −0.91 −0.786.20E−03 4.85E−02 glycoprotein M6B 13956831 986 SLC51B chr15: 65337707-108.87 57.67 −0.92 −1.45 5.00E−05 7.97E−04 solute carrier family 51,beta 65360388 subunit 987 PLAGL1 chr6: 144261436- 19.36 10.24 −0.92−1.52 5.00E−05 7.97E−04 pleiomorphic adenoma gene-like 1 144385736 988ACVR1C chr2: 158383278- 2.12 1.12 −0.92 −1.29 5.00E−05 7.97E−04 activinA receptor, type IC 158485399 989 SEMA6D chr15: 47476402- 12.04 6.34−0.93 −1.52 5.00E−05 7.97E−04 sema domain, transmembrane 48066420 domain(TM), and cytoplasmic domain, (semaphorin) 6D 990 ANO7 chr2: 242127923-24.00 12.62 −0.93 −1.39 5.00E−05 7.97E−04 anoctamin 7 242164791 991GAREML chr2: 26395959- 2.92 1.53 −0.93 −1.02 8.00E−04 9.10E−03 GRB2associated, regulator of 26412532 MAPK1-like 992 PROM2 chr2: 95940200-19.61 10.29 −0.93 −1.63 5.00E−05 7.97E−04 prominin 2 95957055 993TMEM38B chr9: 108456805- 9.34 4.90 −0.93 −1.45 5.00E−05 7.97E−04transmembrane protein 38B 108538892 994 RMDN2 chr2: 38152461- 13.93 7.29−0.93 −1.44 5.00E−05 7.97E−04 regulator of microtubule dynamics 382942852 995 SLC22A5 chr5: 131630144- 35.37 18.48 −0.94 −1.53 5.00E−05 7.97E−04solute carrier family 22 (organic 131731306 cation/carnitinetransporter), member 5 996 HDC chr15: 50534145- 3.33 1.73 −0.95 −0.972.00E−04 2.80E−03 histidine decarboxylase 50558162 997 SOWAHA chr5:132149015- 17.81 9.13 −0.96 −1.63 5.00E−05 7.97E−04 sosondowah ankyrinrepeat 132152489 domain family member A 998 PPFIA3 chr19: 49622645- 6.163.16 −0.97 −1.40 5.00E−05 7.97E−04 protein tyrosine phosphatase,49654287 receptor type, f polypeptide (PTPRF), interacting protein(liprin), alpha 3 999 CA2 chr8: 86376130- 1265.25 647.13 −0.97 −1.115.00E−05 7.97E−04 carbonic anhydrase II 86393721 1000 BEST2 chr19:12863406- 
40.64 20.75 −0.97 −1.20 2.50E−04 3.38E−03 bestrophin 212869271 1001 ADTRP chr6: 11713887- 94.06 48.00 −0.97 −1.65 5.00E−057.97E−04 androgen-dependent TFPI- 11779280 regulating protein 1002HSD11B2 chr16: 67465035- 445.70 226.79 −0.97 −1.03 9.00E−04 1.01E−02hydroxysteroid (11-beta) 67471454 dehydrogenase 2 1003 MPP7 chr10:28339922- 12.28 6.22 −0.98 −1.72 5.00E−05 7.97E−04 membrane protein,palmitoylated 7 28571067 (MAGUK p55 subfamily member 7) 1004 ADAMTSL1chr9: 18474078- 1.83 0.93 −0.98 −1.19 5.00E−05 7.97E−04 ADAMTS-like 118910947 1005 ZNF575 chr19: 44037339- 12.36 6.25 −0.98 −1.24 5.00E−057.97E−04 zinc finger protein 575 44040284 1006 MPV17L chr16: 15489610-4.08 2.05 −0.99 −1.06 3.00E−04 3.97E−03 MPV17 mitochondrial membrane15503543 protein-like 1007 CEACAM7 chr19: 42177234- 1097.39 550.57 −1.00−1.01 8.50E−04 9.59E−03 carcinoembryonic antigen-related 42192206 celladhesion molecule 7 1008 METTL7A chr12: 51318533- 67.39 33.59 −1.00−1.79 5.00E−05 7.97E−04 methyltransferase like 7A 51326300 1009 MPZchr1: 161274524- 2.83 1.41 −1.01 −0.90 1.70E−03 1.71E−02 myelin proteinzero 161279762 1010 CNTN4 chr3: 2140549- 3.10 1.53 −1.02 −1.26 5.00E−057.97E−04 contactin 4 3099645 1011 CHRNA1 chr2: 175612322- 1.70 0.84−1.02 −0.84 4.55E−03 3.75E−02 cholinergic receptor, nicotinic, 175629200alpha 1 (muscle) 1012 PKIB chr6: 122793061- 120.04 59.26 −1.02 −1.805.00E−05 7.97E−04 protein kinase (cAMP-dependent, 123047518 catalytic)inhibitor beta 1013 NLRP2 chr19: 55476651- 5.55 2.74 −1.02 −1.265.00E−05 7.97E−04 NLR family, pyrin domain 55512510 containing 2 1014AZIN2 chr1: 33546713- 5.22 2.57 −1.02 −1.09 2.50E−04 3.38E−03 antizymeinhibitor 2 33585995 1015 PBLD chr10: 70042416- 46.52 22.90 −1.02 −0.941.65E−03 1.66E−02 phenazine biosynthesis-like protein 70167051 domaincontaining 1016 SCN7A chr2: 167260082- 1.12 0.55 −1.03 −1.13 2.00E−042.80E−03 sodium channel, voltage gated, 167343481 type VII alpha subunit1017 ARHGAP44 chr17: 12569206- 21.26 10.42 −1.03 −1.11 1.15E−03 1.24E−02Rho GTPase 
activating protein 44 12921381 1018 NPY1R chr4: 164245116-6.45 3.15 −1.03 −1.36 5.00E−05 7.97E−04 neuropeptide Y receptor Y1164253947 1019 SLC2A5 chr1: 9097004- 5.15 2.51 −1.04 −1.14 1.00E−041.51E−03 solute carrier family 2 (facilitated 9129887 glucose/fructosetransporter), member 5 1020 APOBEC3B chr22: 39353526- 15.42 7.46 −1.05−1.18 1.00E−04 1.51E−03 apolipoprotein B mRNA editing 39394225 enzyme,catalytic polypeptide-like 3B 1021 TINCR chr19: 5558177- 2.98 1.44 −1.05−1.14 2.00E−04 2.80E−03 tissue differentiation-inducing non- 5568005protein coding RNA 1022 PCDH20 chr13: 61983818- 3.32 1.58 −1.07 −1.245.00E−05 7.97E−04 protocadherin 20 61989655 1023 KLK3 chr19: 51358170-2.92 1.39 −1.07 −0.96 2.20E−03 2.11E−02 kallikrein-related peptidase 351364020 1024 PPARGC1B chr5: 149109814- 7.44 3.53 −1.08 −1.85 5.00E−057.97E−04 peroxisome proliferator-activated 149234585 receptor gamma,coactivator 1 beta 1025 TMEM56 chr1: 95558072- 20.12 9.49 −1.08 −1.685.00E−05 7.97E−04 transmembrane protein 56 95712781 1026 LOC102723344chr15: 63682428- 4.67 2.20 −1.08 −1.02 5.00E−04 6.13E−03 uncharacterizedLOC102723344 63729735 1027 FAM189A1 chr15: 29412454- 3.64 1.70 −1.10−1.10 1.30E−03 1.37E−02 family with sequence similarity 29862927 189,member A1 1028 CWH43 chr4: 48988264- 20.77 9.72 −1.10 −1.63 5.00E−057.97E−04 cell wall biogenesis 43 C-terminal 49064095 homolog 1029 PDE6Achr5: 149237518- 2.51 1.17 −1.10 −1.34 5.00E−05 7.97E−04phosphodiesterase 6A, cGMP- 149324356 specific, rod, alpha 1030 PTGDR2chr11: 60609428- 8.13 3.77 −1.11 −1.15 3.50E−04 4.55E−03 prostaglandinD2 receptor 2 60623444 1031 MESP2 chr15: 90319588- 1.76 0.80 −1.14 −0.882.50E−03 2.33E−02 mesoderm posterior bHLH 90321982 transcription factor2 1032 CNTN3 chr3: 74311721- 2.70 1.22 −1.15 −1.43 5.00E−05 7.97E−04contactin 3 (plasmacytoma 74570343 associated) 1033 CHP2 chr16:23765947- 211.39 94.76 −1.16 −1.71 5.00E−05 7.97E−04 calcineurin-likeEF-hand 23770272 protein 2 1034 AQP8 chr16: 25228284- 1108.72 496.18−1.16 −0.70 
2.05E−03 1.98E−02 aquaporin 8 25240253 1035 NEURL1B chr5:172068275- 32.47 14.49 −1.16 −2.03 5.00E−05 7.97E−04 neuralized E3ubiquitin protein 172118533 ligase 1B 1036 HOXD3 chr2: 177028804- 1.230.55 −1.16 −0.89 3.20E−03 2.82E−02 homeobox D3 177037826 1037 DRAICchr15: 69854058- 2.07 0.92 −1.17 −1.04 2.50E−04 3.38E−03 downregulatedRNA in cancer, 69863779 inhibitor of cell invasion and migration 1038ASPG chr14: 104552022- 7.31 3.23 −1.18 −1.00 1.95E−03 1.91E−02asparaginase 104579046 1039 VSTM1 chr19: 54544079- 1.61 0.71 −1.18 −0.742.75E−03 2.52E−02 V-set and transmembrane domain 54567207 containing 11040 SYP chrX: 49044264- 4.51 1.99 −1.18 −1.24 5.00E−05 7.97E−04synaptophysin 49058913 1041 CLCA1 chr1: 86934525- 905.45 397.19 −1.19−0.94 4.00E−04 5.06E−03 chloride channel accessory 1 86965974 1042 KCNG1chr20: 49620192- 3.48 1.48 −1.23 −1.06 1.30E−03 1.37E−02 potassiumchannel, voltage gated 49639675 modifier subfamily G, member 1 1043SLC26A2 chr5: 149340299- 264.43 111.38 −1.25 −1.23 5.00E−05 7.97E−04solute carrier family 26 (anion 149366963 exchanger), member 2 1044EPB41L3 chr18: 5392379- 22.76 9.32 −1.29 −2.15 5.00E−05 7.97E−04erythrocyte membrane protein 5628990 band 4.1-like 3 1045 MSI1 chr12:120779132- 1.27 0.52 −1.29 −1.04 1.15E−03 1.24E−02 musashi RNA-bindingprotein 1 120806983 1046 SOX10 chr22: 38368318- 1.56 0.63 −1.30 −1.131.25E−03 1.33E−02 SRY (sex determining region Y)- 38380539 box 10 1047SUGCT chr7: 40174574- 7.67 3.05 −1.33 −1.47 5.00E−05 7.97E−04succinyl-CoA: glutarate-CoA 40900366 transferase 1048 GFRA3 chr5:137588068- 1.51 0.59 −1.37 −1.06 4.00E−04 5.06E−03 GDNF family receptoralpha 3 137610253 1049 TMEM236 chr10: 17851341- 0.95 0.36 −1.41 −1.181.00E−04 1.51E−03 transmembrane protein 236 18200091 1050 UGT2A3chr4_ctg9_hap1: 46.59 17.22 −1.44 −2.28 5.00E−05 7.97E−04 UDPglucuronosyltransferase 2 506427- family, polypeptide A3 529760 1051OTUD7A chr15: 31775328- 1.37 0.47 −1.54 −1.30 5.00E−05 7.97E−04 OTUdeubiquitinase 7A 31947542 1052 IL13RA2 chrX: 
114238537- 1.20 0.40 −1.59−1.00 3.00E−03 2.69E−02 interleukin 13 receptor, alpha 2 114252207 1053HAVCR1 chr5: 156456530- 2.55 0.84 −1.59 −1.11 6.10E−03 4.79E−02hepatitis A virus cellular 156485970 receptor 1 1054 HTR4 chr5:147830594- 4.13 1.29 −1.68 −1.43 1.00E−03 1.10E−02 5-hydroxytryptamine(serotonin) 148034090 receptor 4, G protein-coupled 1055 GPR143 chrX:9693452- 1.59 0.43 −1.87 −1.34 5.00E−05 7.97E−04 G protein-coupledreceptor 143 9734005 1056 TRIM9 chr14: 51441980- 1.49 0.40 −1.91 −1.591.50E−04 2.18E−03 tripartite motif containing 9 51562422 1057 MIR6506chr16: 15688225- 22.14 0.00 −Inf NA 1.70E−03 1.71E−02 microRNA 650615737023 1058 MIR6739 chr1: 201617449- 107.57 0.00 −Inf NA 3.50E−044.55E−03 microRNA 6739 201853422

TABLE 7
Differentially expressed genes in three comparisons. In Cuffdiff2, samples are normalized for differences in library sizes relative to each other, and therefore the normalized expression is affected by which samples are included in the comparison. For this reason, the mean expression of a gene under one phenotype can appear slightly different in different comparisons.

HP versus SSA/P
No.  gene  locus  mean_HP  mean_SSA/P  log₂FC  p_value  p_adj
1  ABTB2  chr11:34172533-34379555  8.75  5.20  −0.75  5.00E−05  2.53E−03
2  ADRA2A  chr10:112836789-112840662  6.22  15.15  1.28  5.00E−05  2.53E−03
3  ALDH1A1  chr9:75515577-75568233  25.81  52.32  1.02  5.00E−05  2.53E−03
4  ALDH1L1  chr3:125822403-125929011  11.68  44.11  1.92  5.00E−05  2.53E−03
5  ALDOB  chr9:104182841-104198062  40.96  111.78  1.45  2.00E−04  8.55E−03
6  ALDOC  chr17:26900132-26903951  28.45  17.11  −0.73  1.50E−04  6.74E−03
7  APOBEC1  chr12:7801995-7818502  4.06  24.17  2.57  5.00E−05  2.53E−03
8  ARSJ  chr4:114821439-114900878  3.88  1.74  −1.15  5.00E−05  2.53E−03
9  ATF3  chr1:212738675-212794119  41.76  13.43  −1.64  5.00E−05  2.53E−03
10  B3GNT7  chr2:232260334-232265875  114.58  15.38  −2.90  5.00E−05  2.53E−03
11  B4GALNT2  chr17:47209821-47247351  8.12  22.20  1.45  5.00E−05  2.53E−03
12  C12orf75  chr12:105724413-105765296  87.70  56.94  −0.62  1.00E−04  4.83E−03
13  B3GALT5-AS1  chr21:40969074-40984749  15.21  2.81  −2.44  5.00E−05  2.53E−03
14  C4BPB  chr1:207262211-207273337  7.74  24.18  1.64  5.00E−05  2.53E−03
15  CCL13  chr17:32683470-32685629  7.16  21.01  1.55  5.00E−05  2.53E−03
16  CD55  chr1:207494816-207534311  93.25  188.00  1.01  1.00E−04  4.83E−03
17  CDA  chr1:20915443-20945400  87.05  55.46  −0.65  1.00E−04  4.83E−03
18  CHGB  chr20:5891973-5906005  9.53  2.88  −1.73  5.00E−05  2.53E−03
19  CHST5  chr16:75562427-75569068  67.08  36.24  −0.89  5.00E−05  2.53E−03
20  CLC  chr19:40221892-40228669  3.48  11.71  1.75  5.00E−05  2.53E−03
21  CLDN8  chr21:31586323-31588469  35.70  3.80  −3.23  5.00E−05  2.53E−03
22  CNNM2  chr10:104678074-104838344  7.99  4.63  −0.79  6.00E−04  1.99E−02
23  COL18A1  chr21:46825096-46933634  7.26  14.11  0.96  5.00E−05  2.53E−03
24  COL5A3  chr19:10070236-10121147  1.50  2.51  0.74  6.50E−04  2.11E−02
25  CPB1  chr3:148545587-148577972  25.47  1.31  −4.28  5.00E−05  2.53E−03
26  CPNE8  chr12:39046001-39299420  9.30  4.27  −1.12  5.00E−05  2.53E−03
27  CTGF  chr6:132269316-132272518  30.53  17.58  −0.80  5.00E−05  2.53E−03
28  CYP2C18  chr10:96443250-96495947  5.42  9.35  0.79  4.00E−04  1.49E−02
29  CYP2C9  chr10:96698414-96749148  2.20  5.77  1.39  5.00E−05  2.53E−03
30  CYP2W1  chr7:1022834-1029276  3.99  1.22  −1.71  5.00E−05  2.53E−03
31  CYP3A5  chr7:99245811-99277649  86.67  134.53  0.63  1.70E−03  4.44E−02
32  EFNA3  chr1:155051347-155060014  13.94  5.42  −1.36  5.00E−05  2.53E−03
33  EGR1  chr5:137801180-137805004  38.90  12.69  −1.62  5.00E−05  2.53E−03
34  ETNK1  chr12:22778075-22843608  21.99  48.21  1.13  5.00E−05  2.53E−03
35  FAM213A  chr10:82167584-82192753  31.10  47.73  0.62  1.10E−03  3.22E−02
36  FAM3D  chr3:58619669-58652561  559.43  349.35  −0.68  6.50E−04  2.11E−02
37  FER1L4  chr20:34146506-34195484  3.98  7.26  0.87  5.00E−05  2.53E−03
38  FFAR4  chr10:95326421-95349829  34.44  14.86  −1.21  5.00E−05  2.53E−03
39  FOS  chr14:75745480-75748937  188.02  46.34  −2.02  5.00E−05  2.53E−03
40  FOSB  chr19:45971252-45978437  9.39  1.96  −2.26  5.00E−05  2.53E−03
41  FOXA2  chr20:22561641-22566101  13.89  7.52  −0.88  5.00E−05  2.53E−03
42  FOXQ1  chr6:1312674-1314993  2.47  12.74  2.37  5.00E−05  2.53E−03
43  FREM1  chr9:14734663-14910993  0.31  2.68  3.14  5.00E−05  2.53E−03
44  FRMD3  chr9:85857904-86153348  10.47  5.49  −0.93  5.00E−05  2.53E−03
45  FSCN1  chr7:5632435-5646287  5.43  20.43  1.91  5.00E−05  2.53E−03
46  GBA3  chr4:22694536-22821195  3.97  7.33  0.88  1.25E−03  3.53E−02
47  GBP5  chr1:89724633-89738544  1.41  3.01  1.09  5.00E−05  2.53E−03
48  GDF15  chr19:18496967-18499986  28.32  14.08  −1.01  5.00E−05  2.53E−03
49  GPC3  chrX:132669775-133119673  0.34  2.89  3.08  5.00E−05  2.53E−03
50  ADGRF1  chr6:46967812-47010082  2.99  7.66  1.35  5.00E−05  2.53E−03
51  H19  chr11:2016405-2019065  0.95  2.94  1.63  5.00E−05  2.53E−03
52  HOXB13  chr17:46802126-46806111  71.01  11.46  −2.63  5.00E−05  2.53E−03
53  HSD3B2  chr1:119957553-119965662  0.82  4.19  2.36  5.00E−04  1.75E−02
54  HSPA2  chr14:65007185-65009954  11.23  5.77  −0.96  5.00E−05  2.53E−03
55  IGFBP2  chr2:217498126-217529158  46.82  82.33  0.81  5.00E−05  2.53E−03
56  IGFBP5  chr2:217536827-217560272  5.58  8.61  0.62  4.00E−04  1.49E−02
57  INSL5  chr1:67263423-67266942  335.40  9.96  −5.07  5.00E−05  2.53E−03
58  JUN  chr1:59246462-59249785  62.51  38.23  −0.71  5.00E−04  1.75E−02
59  KLF8  chrX:56258821-56314322  0.76  1.61  1.08  3.50E−04  1.32E−02
60  L1TD1  chr1:62660473-62678001  2.01  6.07  1.59  5.00E−05  2.53E−03
61  LINC00261  chr20:22541191-22559280  18.71  12.55  −0.58  8.00E−04  2.53E−02
62  LOC283177  chr11:134306375-134375555  1.38  3.09  1.16  6.00E−04  1.99E−02
63  LOC284454  chr19:13945329-13947173  22.05  12.85  −0.78  3.00E−04  1.18E−02
64  LOC389602  chr7:155755325-155759037  5.80  10.59  0.87  1.50E−04  6.74E−03
65  MFAP5  chr12:8798539-8815433  3.98  1.85  −1.11  5.00E−05  2.53E−03
66  MFSD4  chr1:205538111-205572046  22.34  9.75  −1.20  5.00E−05  2.53E−03
67  MROH6  chr8:144648362-144654928  7.36  11.87  0.69  3.50E−04  1.32E−02
68  MS4A12  chr11:60260250-60274901  328.68  160.33  −1.04  5.00E−05  2.53E−03
69  MUC12  chr7:100612903-100662230  75.53  21.34  −1.82  5.00E−05  2.53E−03
70  MUC17  chr7:100663363-100702140  22.05  71.19  1.69  5.00E−05  2.53E−03
71  NOX1  chrX:100098312-100129334  61.82  40.86  −0.60  3.00E−04  1.18E−02
72  NPY6R  chr5:137136881-137146439  1.27  3.15  1.31  1.10E−03  3.22E−02
73  NQO1  chr16:69743303-69760571  76.50  144.60  0.92  5.00E−05  2.53E−03
74  NR1H4  chr12:100867550-100957645  2.59  6.41  1.31  2.50E−04  1.03E−02
75  NR4A1  chr12:52416615-52453291  56.79  8.93  −2.67  5.00E−05  2.53E−03
76  NR4A2  chr2:157180943-157189287  10.30  2.09  −2.30  5.00E−05  2.53E−03
77  NT5DC3  chr12:104166080-104234975  3.62  2.27  −0.67  9.50E−04  2.87E−02
78  PCSK1  chr5:95726039-95768985  3.08  1.18  −1.39  5.00E−05  2.53E−03
79  PDE3A  chr12:20522178-20837041  7.57  3.95  −0.94  5.00E−05  2.53E−03
80  PDZK1IP1  chr1:47649260-47655771  106.93  266.07  1.32  5.00E−05  2.53E−03
81  PITX2  chr4:111538579-111563279  1.46  12.53  3.10  5.00E−05  2.53E−03
82  PLLP  chr16:57290008-57318584  57.05  33.66  −0.76  5.00E−05  2.53E−03
83  PP7080  chr5:470624-473080  92.49  199.57  1.11  5.00E−05  2.53E−03
84  PPP1R12B  chr1:202317829-202557697  15.60  10.54  −0.57  1.05E−03  3.10E−02
85  PPP1R15A  chr19:49375648-49379319  36.17  21.95  −0.72  1.00E−04  4.83E−03
86  PRAC1  chr17:46799081-46799882  152.89  31.74  −2.27  5.00E−05  2.53E−03
87  PTGDS  chr9:139871955-139876194  8.40  16.79  1.00  2.00E−04  8.55E−03
88  RBP4  chr10:95351592-95360993  23.68  9.91  −1.26  5.00E−05  2.53E−03
89  RGS1  chr1:192544856-192549159  14.62  7.62  −0.94  5.00E−05  2.53E−03
90  RHBDL2  chr1:39351478-39407456  32.16  19.88  −0.69  1.50E−04  6.74E−03
91  SCG2  chr2:224461657-224467217  10.83  2.68  −2.01  5.00E−05  2.53E−03
92  SDR16C5  chr8:57212569-57233241  23.24  35.78  0.62  1.50E−03  4.05E−02
93  SIDT1  chr3:113251217-113348422  14.17  8.96  −0.66  1.50E−04  6.74E−03
94  SIK1  chr21:44834397-44847002  16.20  3.98  −2.03  5.00E−05  2.53E−03
95  SLC14A2  chr18:42792946-43263060  0.13  2.42  4.22  5.00E−05  2.53E−03
96  SLC15A1  chr13:99336054-99404929  16.00  7.95  −1.01  5.00E−05  2.53E−03
97  SLC37A2  chr11:124933012-124960412  7.36  37.13  2.34  5.00E−05  2.53E−03
98  SLC51A  chr3:195943382-195960301  13.44  28.16  1.07  1.35E−03  3.76E−02
99  SLC9A3  chr5:473333-524549  49.72  114.65  1.21  5.00E−05  2.53E−03
100  SPINK5  chr5:147443534-147516925  15.59  5.13  −1.60  5.00E−05  2.53E−03
101  SPON1  chr11:13984183-14289679  43.37  20.70  −1.07  5.00E−05  2.53E−03
102  ST3GAL4  chr11:126225539-126284536  162.38  66.72  −1.28  5.00E−05  2.53E−03
103  ST6GALNAC6  chr9:130647599-130667627  276.43  116.79  −1.24  5.00E−05  2.53E−03
104  STOM  chr9:124101265-124132582  19.56  48.34  1.31  5.00E−05  2.53E−03
105  SULT1C2  chr2:108905094-108926371  11.22  30.35  1.44  5.00E−05  2.53E−03
106  SULT2B1  chr19:49055428-49102684  2.64  6.07  1.20  5.00E−05  2.53E−03
107  TBX10  chr11:67398773-67407031  18.20  8.82  −1.05  5.00E−05  2.53E−03
108  TFCP2L1  chr2:121974163-122042778  28.31  15.57  −0.86  5.00E−05  2.53E−03
109  THRB  chr3:24158644-24541502  5.39  2.79  −0.95  5.00E−05  2.53E−03
110  TM4SF20  chr2:228226873-228244022  5.69  33.22  2.54  5.00E−05  2.53E−03
111  TMC5  chr16:19422056-19510434  20.27  30.20  0.58  1.20E−03  3.40E−02
112  TMEM200A  chr6:130687425-130764210  11.88  3.93  −1.60  5.00E−05  2.53E−03
113  TMEM231  chr16:75572014-75590184  9.16  4.54  −1.01  5.00E−05  2.53E−03
114  TMIGD1  chr17:28643365-28661065  71.95  38.40  −0.91  5.00E−05  2.53E−03
115  TNNC1  chr3:52485106-52488057  10.03  1.09  −3.20  5.00E−05  2.53E−03
116  TPH1  chr11:18042083-18062335  6.70  1.81  −1.89  5.00E−05  2.53E−03
117  TUSC3  chr8:15397595-15624158  7.52  2.23  −1.75  5.00E−05  2.53E−03
118  UGT2B7  chr4:69962192-69978705  2.31  5.90  1.35  5.00E−05  2.53E−03
119  VNN1  chr6:133001996-133035194  3.75  14.48  1.95  5.00E−05  2.53E−03
120  VWA2  chr10:115999012-116054259  1.17  2.78  1.25  5.00E−05  2.53E−03
121  WFDC2  chr20:44098393-44110172  249.52  62.76  −1.99  5.00E−05  2.53E−03

CR versus SSA/P
No.  gene  locus  mean_CR  mean_SSA/P  log₂FC  p_value  p_adj
1  ABTB2  chr11:34172533-34379555  2.83  5.36  0.92  5.00E−05  7.97E−04
2  ADRA2A  chr10:112836789-112840662  28.80  15.54  −0.89  5.00E−05  7.97E−04
3  ALDH1A1  chr9:75515577-75568233  78.28  54.36  −0.53  2.50E−04  3.38E−03
4  ALDH1L1  chr3:125822403-125929011  17.54  45.47  1.37  5.00E−05  7.97E−04
5  ALDOB  chr9:104182841-104198062  14.59  116.11  2.99  5.00E−05  7.97E−04
6  ALDOC  chr17:26900132-26903951  7.60  17.67  1.22  5.00E−05  7.97E−04
7  APOBEC1  chr12:7801995-7818502  8.30  25.07  1.59  5.00E−05  7.97E−04
8  ARSJ  chr4:114821439-114900878  0.66  1.81  1.45  5.00E−05  7.97E−04
9  ATF3  chr1:212738675-212794119  4.84  13.90  1.52  5.00E−05  7.97E−04
10  B3GNT7  chr2:232260334-232265875  7.50  15.87  1.08  5.00E−05  7.97E−04
11  B4GALNT2  chr17:47209821-47247351  88.58  22.95  −1.95  5.00E−05  7.97E−04
12  C12orf75  chr12:105724413-105765296  25.83  58.39  1.18  5.00E−05  7.97E−04
13  B3GALT5-AS1  chr21:40969074-40984749  0.54  2.90  2.42  5.00E−05  7.97E−04
14  C4BPB  chr1:207262211-207273337  5.55  25.05  2.18  5.00E−05  7.97E−04
15  CCL13  chr17:32683470-32685629  42.01  21.58  −0.96  5.00E−05  7.97E−04
16  CD55  chr1:207494816-207534311  25.68  194.65  2.92  5.00E−05  7.97E−04
17  CDA  chr1:20915443-20945400  21.65  57.17  1.40  5.00E−05  7.97E−04
18  CHGB  chr20:5891973-5906005  8.07  2.99  −1.43  5.00E−05  7.97E−04
19  CHST5  chr16:75562427-75569068  12.57  37.20  1.56  5.00E−05  7.97E−04
20  CLC  chr19:40221892-40228669  21.80  12.10  −0.85  1.50E−04  2.18E−03
21  CLDN8  chr21:31586323-31588469  1.70  3.93  1.21  5.50E−04  6.69E−03
22  CNNM2  chr10:104678074-104838344  7.37  4.80  −0.62  4.40E−03  3.65E−02
23  COL18A1  chr21:46825096-46933634  23.26  14.53  −0.68  1.00E−04  1.51E−03
24  COL5A3  chr19:10070236-10121147  4.06  2.57  −0.66  7.50E−04  8.64E−03
25  CPB1  chr3:148545587-148577972  0.10  1.36  3.79  5.00E−05  7.97E−04
26  CPNE8  chr12:39046001-39299420  6.68  4.44  −0.59  2.45E−03  2.29E−02
27  CTGF  chr6:132269316-132272518  10.99  18.12  0.72  5.00E−05  7.97E−04
28  CYP2C18  chr10:96443250-96495947  17.00  9.69  −0.81  5.00E−05  7.97E−04
29  CYP2C9  chr10:96698414-96749148  11.44  5.96  −0.94  5.00E−05  7.97E−04
30  CYP2W1  chr7:1022834-1029276  0.44  1.25  1.50  5.00E−05  7.97E−04
31  CYP3A5  chr7:99245811-99277649  39.96  139.97  1.81  5.00E−05  7.97E−04
32  EFNA3  chr1:155051347-155060014  3.36  5.57  0.73  5.45E−03  4.37E−02
33  EGR1  chr5:137801180-137805004  4.44  13.08  1.56  5.00E−05  7.97E−04
34  ETNK1  chr12:22778075-22843608  134.13  49.89  −1.43  5.00E−05  7.97E−04
35  FAM213A  chr10:82167584-82192753  70.97  49.42  −0.52  1.15E−03  1.24E−02
36  FAM3D  chr3:58619669-58652561  251.01  361.22  0.53  1.65E−03  1.66E−02
37  FER1L4  chr20:34146506-34195484  4.42  7.45  0.75  5.00E−05  7.97E−04
38  FFAR4  chr10:95326421-95349829  6.41  15.32  1.26  5.00E−05  7.97E−04
39  FOS  chr14:75745480-75748937  15.58  47.68  1.61  5.00E−05  7.97E−04
40  FOSB  chr19:45971252-45978437  0.41  2.01  2.31  5.00E−05  7.97E−04
41  FOXA2  chr20:22561641-22566101  3.75  7.77  1.05  5.00E−05  7.97E−04
42  FOXQ1  chr6:1312674-1314993  1.83  13.06  2.84  5.00E−05  7.97E−04
43  FREM1  chr9:14734663-14910993  1.02  2.81  1.47  5.00E−05  7.97E−04
44  FRMD3  chr9:85857904-86153348  3.51  5.74  0.71  5.00E−05  7.97E−04
45  FSCN1  chr7:5632435-5646287  7.00  20.88  1.58  5.00E−05  7.97E−04
46  GBA3  chr4:22694536-22821195  21.87  7.54  −1.54  5.00E−05  7.97E−04
47  GBP5  chr1:89724633-89738544  1.95  3.12  0.68  4.35E−03  3.63E−02
48  GDF15  chr19:18496967-18499986  6.15  14.54  1.24  5.00E−05  7.97E−04
49  GPC3  chrX:132669775-133119673  9.40  2.99  −1.65  5.00E−05  7.97E−04
50  ADGRF1  chr6:46967812-47010082  1.19  8.02  2.75  5.00E−05  7.97E−04
51  H19  chr11:2016405-2019065  1.28  3.02  1.24  1.50E−03  1.53E−02
52  HOXB13  chr17:46802126-46806111  0.34  11.83  5.11  5.00E−05  7.97E−04
53  HSD3B2  chr1:119957553-119965662  18.40  4.32  −2.09  5.00E−05  7.97E−04
54  HSPA2  chr14:65007185-65009954  10.07  5.99  −0.75  5.00E−05  7.97E−04
55  IGFBP2  chr2:217498126-217529158  170.45  84.99  −1.00  5.00E−05  7.97E−04
56  IGFBP5  chr2:217536827-217560272  17.50  8.84  −0.99  5.00E−05  7.97E−04
57  INSL5  chr1:67263423-67266942  1.29  10.37  3.00  5.00E−05  7.97E−04
58  JUN  chr1:59246462-59249785  25.77  39.35  0.61  5.00E−05  7.97E−04
59  KLF8  chrX:56258821-56314322  3.36  1.66  −1.02  5.00E−05  7.97E−04
60  L1TD1  chr1:62660473-62678001  10.17  6.32  −0.69  5.00E−04  6.13E−03
61  LINC00261  chr20:22541191-22559280  4.53  13.04  1.53  5.00E−05  7.97E−04
62  LOC283177  chr11:134306375-134375555  5.05  3.19  −0.66  1.70E−03  1.71E−02
63  LOC284454  chr19:13945329-13947173  7.45  13.21  0.83  1.00E−04  1.51E−03
64  LOC389602  chr7:155755325-155759037  1.88  10.94  2.54  5.00E−05  7.97E−04
65  MFAP5  chr12:8798539-8815433  0.62  1.94  1.65  5.00E−05  7.97E−04
66  MFSD4  chr1:205538111-205572046  6.56  10.06  0.62  3.00E−04  3.97E−03
67  MROH6  chr8:144648362-144654928  6.94  12.23  0.82  5.00E−05  7.97E−04
68  MS4A12  chr11:60260250-60274901  267.62  164.87  −0.70  4.00E−04  5.06E−03
69  MUC12  chr7:100612903-100662230  12.37  22.10  0.84  1.00E−04  1.51E−03
70  MUC17  chr7:100663363-100702140  2.42  74.14  4.94  5.00E−05  7.97E−04
71  NOX1  chrX:100098312-100129334  24.05  42.36  0.82  5.00E−05  7.97E−04
72  NPY6R  chr5:137136881-137146439  4.98  3.25  −0.61  3.55E−03  3.07E−02
73  NQO1  chr16:69743303-69760571  92.08  149.67  0.70  5.00E−05  7.97E−04
74  NR1H4  chr12:100867550-100957645  17.72  6.62  −1.42  5.00E−05  7.97E−04
75  NR4A1  chr12:52416615-52453291  4.22  9.19  1.12  5.00E−05  7.97E−04
76  NR4A2  chr2:157180943-157189287  0.85  2.16  1.35  5.00E−05  7.97E−04
77  NT5DC3  chr12:104166080-104234975  1.04  2.35  1.17  5.00E−05  7.97E−04
78  PCSK1  chr5:95726039-95768985  0.58  1.22  1.08  1.00E−04  1.51E−03
79  PDE3A  chr12:20522178-20837041  8.66  4.08  −1.09  5.00E−05  7.97E−04
80  PDZK1IP1  chr1:47649260-47655771  21.40  273.29  3.67  5.00E−05  7.97E−04
81  PITX2  chr4:111538579-111563279  45.04  12.98  −1.80  5.00E−05  7.97E−04
82  PLLP  chr16:57290008-57318584  7.25  34.68  2.26  5.00E−05  7.97E−04
83  PP7080  chr5:470624-473080  564.44  206.58  −1.45  5.00E−05  7.97E−04
84  PPP1R12B  chr1:202317829-202557697  6.15  10.90  0.83  5.00E−05  7.97E−04
85  PPP1R15A  chr19:49375648-49379319  12.71  22.53  0.83  5.00E−05  7.97E−04
86  PRAC1  chr17:46799081-46799882  1.53  32.82  4.43  5.00E−05  7.97E−04
87  PTGDS  chr9:139871955-139876194  31.17  17.27  −0.85  2.00E−04  2.80E−03
88  RBP4  chr10:95351592-95360993  4.66  10.05  1.11  5.00E−05  7.97E−04
89  RGS1  chr1:192544856-192549159  5.04  7.98  0.66  2.70E−03  2.48E−02
90  RHBDL2  chr1:39351478-39407456  11.06  20.55  0.89  5.00E−05  7.97E−04
91  SCG2  chr2:224461657-224467217  1.29  2.79  1.12  1.55E−03  1.57E−02
92  SDR16C5  chr8:57212569-57233241  2.46  37.01  3.91  5.00E−05  7.97E−04
93  SIDT1  chr3:113251217-113348422  5.95  9.23  0.63  5.00E−05  7.97E−04
94  SIK1  chr21:44834397-44847002  1.75  4.09  1.22  5.00E−05  7.97E−04
95  SLC14A2  chr18:42792946-43263060  10.40  2.51  −2.05  5.00E−05  7.97E−04
96  SLC15A1  chr13:99336054-99404929  1.88  8.23  2.13  5.00E−05  7.97E−04
97  SLC37A2  chr11:124933012-124960412  196.43  38.34  −2.36  5.00E−05  7.97E−04
98  SLC51A  chr3:195943382-195960301  96.90  28.78  −1.75  5.00E−05  7.97E−04
99  SLC9A3  chr5:473333-524549  398.94  120.06  −1.73  5.00E−05  7.97E−04
100  SPINK5  chr5:147443534-147516925  1.87  5.36  1.52  5.00E−05  7.97E−04
101  SPON1  chr11:13984183-14289679  14.15  21.35  0.59  1.50E−04  2.18E−03
102  ST3GAL4  chr11:126225539-126284536  5.42  68.82  3.67  5.00E−05  7.97E−04
103  ST6GALNAC6  chr9:130647599-130667627  32.74  120.19  1.88  5.00E−05  7.97E−04
104  STOM  chr9:124101265-124132582  33.99  50.37  0.57  1.30E−03  1.37E−02
105  SULT1C2  chr2:108905094-108926371  0.58  31.49  5.77  5.00E−05  7.97E−04
106  SULT2B1  chr19:49055428-49102684  3.47  6.23  0.84  1.30E−03  1.37E−02
107  TBX10  chr11:67398773-67407031  3.73  9.04  1.28  5.00E−05  7.97E−04
108  TFCP2L1  chr2:121974163-122042778  23.00  16.01  −0.52  6.50E−04  7.69E−03
109  THRB  chr3:24158644-24541502  0.93  2.88  1.63  5.00E−05  7.97E−04
110  TM4SF20  chr2:228226873-228244022  3.33  34.74  3.38  5.00E−05  7.97E−04
111  TMC5  chr16:19422056-19510434  13.73  31.35  1.19  5.00E−05  7.97E−04
112  TMEM200A  chr6:130687425-130764210  2.33  4.06  0.80  1.30E−03  1.37E−02
113  TMEM231  chr16:75572014-75590184  2.63  4.67  0.83  2.50E−04  3.38E−03
114  TMIGD1  chr17:28643365-28661065  80.33  39.78  −1.01  5.00E−05  7.97E−04
115  TNNC1  chr3:52485106-52488057  0.24  1.13  2.23  1.55E−03  1.57E−02
116  TPH1  chr11:18042083-18062335  3.84  1.86  −1.04  2.50E−04  3.38E−03
117  TUSC3  chr8:15397595-15624158  1.39  2.30  0.73  4.40E−03  3.65E−02
118  UGT2B7  chr4:69962192-69978705  1.61  6.19  1.94  5.00E−05  7.97E−04
119  VNN1  chr6:133001996-133035194  0.39  15.17  5.27  5.00E−05  7.97E−04
120  VWA2  chr10:115999012-116054259  1.18  2.85  1.27  5.00E−05  7.97E−04
121  WFDC2  chr20:44098393-44110172  29.07  64.83  1.16  5.00E−05  7.97E−04

CR versus CL
No.  gene  locus  mean_CR  mean_CL  log₂FC  p_value  p_adj
1  ABTB2  chr11:34172533-34379555  2.92  6.12  1.07  5.00E−05  1.05E−03
2  ADRA2A  chr10:112836789-112840662  29.66  8.07  −1.88  5.00E−05  1.05E−03
3  ALDH1A1  chr9:75515577-75568233  80.57  52.01  −0.63  5.00E−05  1.05E−03
4  ALDH1L1  chr3:125822403-125929011  18.09  6.30  −1.52  5.00E−05  1.05E−03
5  ALDOB  chr9:104182841-104198062  15.05  2.93  −2.36  5.00E−05  1.05E−03
6  ALDOC  chr17:26900132-26903951  7.84  21.10  1.43  5.00E−05  1.05E−03
7  APOBEC1  chr12:7801995-7818502  8.55  3.02  −1.50  5.00E−05  1.05E−03
8  ARSJ  chr4:114821439-114900878  0.68  1.59  1.22  5.00E−05  1.05E−03
9  ATF3  chr1:212738675-212794119  5.00  20.46  2.03  5.00E−05  1.05E−03
10  B3GNT7  chr2:232260334-232265875  7.72  73.01  3.24  5.00E−05  1.05E−03
11  B4GALNT2  chr17:47209821-47247351  91.30  32.24  −1.50  5.00E−05  1.05E−03
12  C12orf75  chr12:105724413-105765296  26.58  71.16  1.42  5.00E−05  1.05E−03
13  B3GALT5-AS1  chr21:40969074-40984749  0.56  12.47  4.48  5.00E−05  1.05E−03
14  C4BPB  chr1:207262211-207273337  5.71  3.25  −0.82  2.75E−03  3.13E−02
15  CCL13  chr17:32683470-32685629  43.28  22.80  −0.92  5.00E−05  1.05E−03
16  CD55  chr1:207494816-207534311  26.49  15.08  −0.81  5.00E−05  1.05E−03
17  CDA  chr1:20915443-20945400  22.30  51.39  1.20  5.00E−05  1.05E−03
18  CHGB  chr20:5891973-5906005  8.32  4.84  −0.78  5.00E−05  1.05E−03
19  CHST5  chr16:75562427-75569068  12.95  44.11  1.77  5.00E−05  1.05E−03
20  CLC  chr19:40221892-40228669  22.51  8.72  −1.37  5.00E−05  1.05E−03
21  CLDN8  chr21:31586323-31588469  1.75  50.41  4.85  5.00E−05  1.05E−03
22  CNNM2  chr10:104678074-104838344  7.59  11.76  0.63  1.50E−04  2.76E−03
23  COL18A1  chr21:46825096-46933634  23.98  10.17  −1.24  5.00E−05  1.05E−03
24  COL5A3  chr19:10070236-10121147  4.19  1.49  −1.50  5.00E−05  1.05E−03
25  CPB1  chr3:148545587-148577972  0.10  1.54  3.93  5.00E−05  1.05E−03
26  CPNE8  chr12:39046001-39299420  6.87  10.99  0.68  5.00E−05  1.05E−03
27  CTGF  chr6:132269316-132272518  11.32  18.80  0.73  5.00E−05  1.05E−03
28  CYP2C18  chr10:96443250-96495947  17.49  5.79  −1.60  5.00E−05  1.05E−03
29  CYP2C9  chr10:96698414-96749148  11.77  2.07  −2.50  5.00E−05  1.05E−03
30  CYP2W1  chr7:1022834-1029276  0.45  2.27  2.32  5.00E−05  1.05E−03
31  CYP3A5  chr7:99245811-99277649  41.12  67.62  0.72  5.00E−05  1.05E−03
32  EFNA3  chr1:155051347-155060014  3.46  7.52  1.12  5.00E−05  1.05E−03
33  EGR1  chr5:137801180-137805004  4.57  15.96  1.80  5.00E−05  1.05E−03
34  ETNK1  chr12:22778075-22843608  137.99  35.89  −1.94  5.00E−05  1.05E−03
35  FAM213A  chr10:82167584-82192753  73.04  50.96  −0.52  5.00E−05  1.05E−03
36  FAM3D  chr3:58619669-58652561  258.54  409.90  0.66  5.00E−05  1.05E−03
37  FER1L4  chr20:34146506-34195484  4.56  2.13  −1.10  5.00E−05  1.05E−03
38  FFAR4  chr10:95326421-95349829  6.61  20.46  1.63  5.00E−05  1.05E−03
39  FOS  chr14:75745480-75748937  16.05  75.85  2.24  5.00E−05  1.05E−03
40  FOSB  chr19:45971252-45978437  0.42  5.31  3.67  5.00E−05  1.05E−03
41  FOXA2  chr20:22561641-22566101  3.87  10.61  1.46  5.00E−05  1.05E−03
42  FOXQ1  chr6:1312674-1314993  1.88  0.31  −2.59  5.00E−05  1.05E−03
43  FREM1  chr9:14734663-14910993  1.05  0.22  −2.27  5.00E−05  1.05E−03
44  FRMD3  chr9:85857904-86153348  3.61  7.20  0.99  5.00E−05  1.05E−03
45  FSCN1  chr7:5632435-5646287  7.21  4.41  −0.71  2.00E−04  3.55E−03
46  GBA3  chr4:22694536-22821195  22.49  8.51  −1.40  5.00E−05  1.05E−03
47  GBP5  chr1:89724633-89738544  2.01  1.26  −0.68  2.00E−03  2.43E−02
48  GDF15  chr19:18496967-18499986  6.34  11.32  0.84  2.50E−04  4.29E−03
49  GPC3  chrX:132669775-133119673  9.68  0.57  −4.08  5.00E−05  1.05E−03
50  ADGRF1  chr6:46967812-47010082  1.23  0.65  −0.92  1.60E−03  2.03E−02
51  H19  chr11:2016405-2019065  1.32  0.45  −1.54  1.00E−04  1.95E−03
52  HOXB13  chr17:46802126-46806111  0.35  54.89  7.28  5.00E−05  1.05E−03
53  HSD3B2  chr1:119957553-119965662  19.01  1.65  −3.53  5.00E−05  1.05E−03
54  HSPA2  chr14:65007185-65009954  10.38  15.10  0.54  2.00E−04  3.55E−03
55  IGFBP2  chr2:217498126-217529158  175.71  60.48  −1.54  5.00E−05  1.05E−03
56  IGFBP5  chr2:217536827-217560272  18.02  9.16  −0.98  5.00E−05  1.05E−03
57  INSL5  chr1:67263423-67266942  1.33  77.20  5.86  5.00E−05  1.05E−03
58  JUN  chr1:59246462-59249785  26.56  45.42  0.77  5.00E−05  1.05E−03
59  KLF8  chrX:56258821-56314322  3.46  1.10  −1.66  5.00E−05  1.05E−03
60  L1TD1  chr1:62660473-62678001  10.49  2.06  −2.34  5.00E−05  1.05E−03
61  LINC00261  chr20:22541191-22559280  4.66  13.47  1.53  5.00E−05  1.05E−03
62  LOC283177  chr11:134306375-134375555  5.19  3.50  −0.57  1.85E−03  2.29E−02
63  LOC284454  chr19:13945329-13947173  7.67  12.73  0.73  5.00E−05  1.05E−03
64  LOC389602  chr7:155755325-155759037  1.95  3.35  0.78  2.20E−03  2.63E−02
65  MFAP5  chr12:8798539-8815433  0.64  2.70  2.09  5.00E−05  1.05E−03
66  MFSD4  chr1:205538111-205572046  6.75  22.34  1.73  5.00E−05  1.05E−03
67  MROH6  chr8:144648362-144654928  7.15  5.05  −0.50  4.75E−03  4.76E−02
68  MS4A12  chr11:60260250-60274901  275.22  460.74  0.74  5.00E−05  1.05E−03
69  MUC12  chr7:100612903-100662230  12.73  47.67  1.91  5.00E−05  1.05E−03
70  MUC17  chr7:100663363-100702140  2.49  6.86  1.46  5.00E−05  1.05E−03
71  NOX1  chrX:100098312-100129334  24.77  61.03  1.30  5.00E−05  1.05E−03
72  NPY6R  chr5:137136881-137146439  5.12  2.61  −0.97  5.00E−05  1.05E−03
73  NQO1  chr16:69743303-69760571  94.80  58.34  −0.70  5.00E−05  1.05E−03
74  NR1H4  chr12:100867550-100957645  18.23  5.09  −1.84  5.00E−05  1.05E−03
75  NR4A1  chr12:52416615-52453291  4.34  8.55  0.98  5.00E−05  1.05E−03
76  NR4A2  chr2:157180943-157189287  0.87  1.62  0.89  4.50E−04  6.98E−03
77  NT5DC3  chr12:104166080-104234975  1.08  2.19  1.02  5.00E−05  1.05E−03
78  PCSK1  chr5:95726039-95768985  0.59  1.29  1.12  5.00E−05  1.05E−03
79  PDE3A  chr12:20522178-20837041  8.91  14.93  0.75  5.00E−05  1.05E−03
80  PDZK1IP1  chr1:47649260-47655771  22.03  35.74  0.70  5.00E−05  1.05E−03
81  PITX2  chr4:111538579-111563279  46.39  0.92  −5.66  5.00E−05  1.05E−03
82  PLLP  chr16:57290008-57318584  7.47  14.23  0.93  5.00E−05  1.05E−03
83  PP7080  chr5:470624-473080  581.11  130.86  −2.15  5.00E−05  1.05E−03
84  PPP1R12B  chr1:202317829-202557697  6.35  13.73  1.11  5.00E−05  1.05E−03
85  PPP1R15A  chr19:49375648-49379319  13.12  20.35  0.63  5.00E−05  1.05E−03
86  PRAC1  chr17:46799081-46799882  1.57  198.20  6.98  5.00E−05  1.05E−03
87  PTGDS  chr9:139871955-139876194  32.14  10.87  −1.56  5.00E−05  1.05E−03
88  RBP4  chr10:95351592-95360993  4.80  16.22  1.76  5.00E−05  1.05E−03
89  RGS1  chr1:192544856-192549159  5.19  8.46  0.71  2.50E−04  4.29E−03
90  RHBDL2  chr1:39351478-39407456  11.40  21.90  0.94  5.00E−05  1.05E−03
91  SCG2  chr2:224461657-224467217  1.32  3.78  1.52  5.00E−05  1.05E−03
92  SDR16C5  chr8:57212569-57233241  2.53  5.28  1.06  5.00E−05  1.05E−03
93  SIDT1  chr3:113251217-113348422  6.13  11.75  0.94  5.00E−05  1.05E−03
94  SIK1  chr21:44834397-44847002  1.80  2.80  0.63  4.95E−03  4.91E−02
95  SLC14A2  chr18:42792946-43263060  10.71  0.12  −6.51  5.00E−05  1.05E−03
96  SLC15A1  chr13:99336054-99404929  1.94  10.44  2.43  5.00E−05  1.05E−03
97  SLC37A2  chr11:124933012-124960412  202.30  5.51  −5.20  5.00E−05  1.05E−03
98  SLC51A  chr3:195943382-195960301  99.86  27.31  −1.87  5.00E−05  1.05E−03
99  SLC9A3  chr5:473333-524549  411.65  94.88  −2.12  5.00E−05  1.05E−03
100  SPINK5  chr5:147443534-147516925  1.93  5.69  1.56  5.00E−05  1.05E−03
101  SPON1  chr11:13984183-14289679  14.58  36.49  1.32  5.00E−05  1.05E−03
102  ST3GAL4  chr11:126225539-126284536  5.58  99.06  4.15  5.00E−05  1.05E−03
103  ST6GALNAC6  chr9:130647599-130667627  33.72  176.77  2.39  5.00E−05  1.05E−03
104  STOM  chr9:124101265-124132582  34.99  22.70  −0.62  5.00E−05  1.05E−03
105  SULT1C2  chr2:108905094-108926371  0.59  2.21  1.90  5.00E−05  1.05E−03
106  SULT2B1  chr19:49055428-49102684  3.57  1.44  −1.31  5.00E−05  1.05E−03
107  TBX10  chr11:67398773-67407031  3.85  10.01  1.38  5.00E−05  1.05E−03
108  TFCP2L1  chr2:121974163-122042778  23.68  38.09  0.69  5.00E−05  1.05E−03
109  THRB  chr3:24158644-24541502  0.96  6.97  2.86  5.00E−05  1.05E−03
110  TM4SF20  chr2:228226873-228244022  3.44  6.33  0.88  5.00E−05  1.05E−03
111  TMC5  chr16:19422056-19510434  14.13  20.66  0.55  5.00E−05  1.05E−03
112  TMEM200A  chr6:130687425-130764210  2.41  13.64  2.50  5.00E−05  1.05E−03
113  TMEM231  chr16:75572014-75590184  2.71  5.24  0.95  5.00E−05  1.05E−03
114  TMIGD1  chr17:28643365-28661065  82.69  160.65  0.96  5.00E−05  1.05E−03
115  TNNC1  chr3:52485106-52488057  0.25  1.96  2.98  1.60E−03  2.03E−02
116  TPH1  chr11:18042083-18062335  3.95  7.71  0.96  5.00E−05  1.05E−03
117  TUSC3  chr8:15397595-15624158  1.43  4.25  1.57  5.00E−05  1.05E−03
118  UGT2B7  chr4:69962192-69978705  1.65  4.15  1.33  5.00E−05  1.05E−03
119  VNN1  chr6:133001996-133035194  0.40  1.19  1.56  5.00E−05  1.05E−03
120  VWA2  chr10:115999012-116054259  1.22  0.40  −1.63  5.00E−05  1.05E−03
121  WFDC2  chr20:44098393-44110172  29.90  246.61  3.04  5.00E−05  1.05E−03
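As a reading aid for the log₂FC column of Table 7, the tabulated values appear consistent with the base-2 logarithm of the ratio of the second-listed group mean to the first-listed group mean. The following is a minimal Python sketch under that assumption; the function name `log2_fold_change` is hypothetical, and the handling of zero means (returning an infinite value) is an illustrative choice rather than a description of what Cuffdiff2 does internally.

```python
import math


def log2_fold_change(mean_first: float, mean_second: float) -> float:
    """Return log2(mean_second / mean_first).

    Assumption: the first argument is the first-listed group of a comparison
    (e.g. HP in "HP versus SSA/P") and the second is the second-listed group.
    Zero means are mapped to +/-inf (or NaN if both are zero) for illustration.
    """
    if mean_first == 0 and mean_second == 0:
        return float("nan")
    if mean_second == 0:
        return float("-inf")
    if mean_first == 0:
        return float("inf")
    return math.log2(mean_second / mean_first)


# ABTB2, HP versus SSA/P: mean_HP = 8.75, mean_SSA/P = 5.20
print(round(log2_fold_change(8.75, 5.20), 2))   # -0.75
# ADRA2A, HP versus SSA/P: mean_HP = 6.22, mean_SSA/P = 15.15
print(round(log2_fold_change(6.22, 15.15), 2))  # 1.28
```

Because the tabulated means are rounded to two decimals, values recomputed this way can differ from the table in the last digit for some rows.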

TABLE 8 Number of common genes between 3 different platforms (there are 16849 genes common in all the 3 platforms).

platform  RNA-seq  Illumina (IlluminaHumanWGDASLv4)  Affymetrix (hgu133plus2)
RNA-seq  25268  19181  18989
Illumina (IlluminaHumanWGDASLv4)  19181  19463  17004
Affymetrix (hgu133plus2)  18989  17004  20388

TABLE 9 Class probabilities assigned using empirical approach, normal approximation, shrunken centroid classifier (independent of the summary metric), and the Cantelli's inequality lower bound when the 18-gene signature from Table 2 is used.

Sample  True.class  SM.standardized  Empirical.HP  Empirical.SSA/P  Normal.HP  Normal.SSA/P
GSM1072010  HP  −3.76  9.91E−01  9.50E−03  1.00E+00  8.40E−05
GSM1072011  HP  −4.79  9.97E−01  2.65E−03  1.00E+00  8.46E−07
GSM1072012  HP  −5.26  9.99E−01  1.22E−03  1.00E+00  7.02E−08
GSM1072013  HP  −5.75  9.99E−01  6.67E−04  1.00E+00  4.38E−09
GSM1072014  HP  −5.54  9.99E−01  9.00E−04  1.00E+00  1.54E−08
GSM1072015  HP  −6.97  1.00E+00  1.25E−04  1.00E+00  1.56E−12
GSM1072016  SSA/P  3.50  8.42E−04  9.99E−01  2.32E−04  1.00E+00
GSM1072017  SSA/P  7.39  0.00E+00  1.00E+00  7.39E−14  1.00E+00
GSM1072018  SSA/P  5.97  1.67E−05  1.00E+00  1.19E−09  1.00E+00
GSM1072019  SSA/P  7.70  0.00E+00  1.00E+00  6.77E−15  1.00E+00
GSM1072020  SSA/P  7.29  0.00E+00  1.00E+00  1.54E−13  1.00E+00
GSM1072021  SSA/P  2.48  5.48E−03  9.95E−01  6.56E−03  9.93E−01

Sample  CLB.HP  CLB.SSA/P  CLB.decision  SCC.HP  SCC.SSA/P  SCC.decision
GSM1072010  9.34E−01  0.00E+00  HP  8.34E−01  1.66E−01  HP
GSM1072011  9.58E−01  0.00E+00  HP  9.32E−01  6.78E−02  HP
GSM1072012  9.65E−01  0.00E+00  HP  9.62E−01  3.75E−02  HP
GSM1072013  9.71E−01  0.00E+00  HP  9.61E−01  3.86E−02  HP
GSM1072014  9.68E−01  0.00E+00  HP  9.64E−01  3.62E−02  HP
GSM1072015  9.80E−01  0.00E+00  HP  9.88E−01  1.16E−02  HP
GSM1072016  0.00E+00  9.25E−01  SSA/P  1.10E−02  9.89E−01  SSA/P
GSM1072017  0.00E+00  9.82E−01  SSA/P  6.09E−04  9.99E−01  SSA/P
GSM1072018  0.00E+00  9.73E−01  SSA/P  2.00E−03  9.98E−01  SSA/P
GSM1072019  0.00E+00  9.83E−01  SSA/P  4.53E−04  1.00E+00  SSA/P
GSM1072020  0.00E+00  9.82E−01  SSA/P  6.30E−04  9.99E−01  SSA/P
GSM1072021  0.00E+00  8.60E−01  SSA/P  2.60E−02  9.74E−01  SSA/P
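The Normal.* and CLB.* columns above can be related back to the standardized summary metric with a short calculation. The following is a minimal sketch, not the source's implementation: it assumes that Normal.SSA/P is the standard normal CDF evaluated at SM.standardized, and that the Cantelli lower bound is 1 − 1/(1 + SM²) assigned to the class on whose side the score falls. Both assumptions reproduce the tabulated values to within the rounding of the two-decimal SM. The function names and the 0.5 cut-off for the "uncertain" decision are hypothetical; the tables only show that a bound of 0.363 is reported as uncertain while 0.818 is not.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed with the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def cantelli_lower_bound(sm: float) -> float:
    """Cantelli (one-sided Chebyshev) bound for a unit-variance score: 1 - 1/(1 + sm^2)."""
    return 1.0 - 1.0 / (1.0 + sm * sm)


def class_probabilities(sm: float, threshold: float = 0.5) -> dict:
    """Map a standardized summary metric to the Normal.* and CLB.* quantities.

    `threshold` is a hypothetical cut-off below which the CLB decision is
    reported as "uncertain"; the source does not state the value it used.
    """
    bound = cantelli_lower_bound(sm)
    clb_hp = bound if sm < 0 else 0.0      # bound goes to HP for negative scores
    clb_ssap = bound if sm > 0 else 0.0    # and to SSA/P for positive scores
    if bound < threshold:
        decision = "uncertain"
    else:
        decision = "HP" if sm < 0 else "SSA/P"
    return {
        "Normal.HP": normal_cdf(-sm),      # upper tail, computed directly for precision
        "Normal.SSA/P": normal_cdf(sm),
        "CLB.HP": clb_hp,
        "CLB.SSA/P": clb_ssap,
        "CLB.decision": decision,
    }


if __name__ == "__main__":
    # Standardized scores taken from the tables: -3.76 and 2.48 (GSM1072010 and
    # GSM1072021, 18-gene signature) and 0.38 (GSM270797.CEL, 16-gene signature).
    for sm in (-3.76, 2.48, 0.38):
        print(sm, class_probabilities(sm))
```

Because the Cantelli bound makes no distributional assumption, it is far more conservative than the normal approximation, which is why a sample with a small absolute score can be called by the normal approximation yet remain "uncertain" under the bound.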

TABLE 10 Class probabilities assigned using empirical approach, normal approximation, shrunken centroid classifier (independent of the summary metric), and the Cantelli's inequality lower bound when the 16-gene signature from Table 2 is used.

Sample  True.class  SM.standardized  Empirical.HP  Empirical.SSA/P  Normal.HP  Normal.SSA/P
GSM270797.CEL  HP  0.38  3.33E−01  6.67E−01  3.52E−01  6.48E−01
GSM270798.CEL  HP  −4.70  9.97E−01  3.03E−03  1.00E+00  1.27E−06
GSM270799.CEL  HP  −5.12  9.98E−01  1.53E−03  1.00E+00  1.54E−07
GSM270800.CEL  HP  −5.79  9.99E−01  6.50E−04  1.00E+00  3.50E−09
GSM270801.CEL  HP  −5.44  9.99E−01  1.00E−03  1.00E+00  2.73E−08
GSM270802.CEL  HP  −0.76  8.50E−01  1.50E−01  7.75E−01  2.25E−01
GSM270803.CEL  HP  −5.40  9.99E−01  1.03E−03  1.00E+00  3.26E−08
GSM270804.CEL  HP  −4.16  9.94E−01  6.05E−03  1.00E+00  1.62E−05
GSM270805.CEL  HP  −3.26  9.85E−01  1.53E−02  9.99E−01  5.49E−04
GSM270806.CEL  HP  −2.35  9.72E−01  2.77E−02  9.91E−01  9.44E−03
GSM270807.CEL  HP  4.35  2.00E−04  1.00E+00  6.89E−06  1.00E+00
GSM1100490_EXT_417.CEL  SSA/P  7.82  0.00E+00  1.00E+00  2.60E−15  1.00E+00
GSM1100491_EXT_418.CEL  SSA/P  9.36  0.00E+00  1.00E+00  3.85E−21  1.00E+00
GSM1100492_EXT_419.CEL  SSA/P  6.42  0.00E+00  1.00E+00  6.64E−11  1.00E+00
GSM1100493_EXT_420.CEL  SSA/P  5.44  1.67E−05  1.00E+00  2.62E−08  1.00E+00
GSM1100494_EXT_421.CEL  SSA/P  7.35  0.00E+00  1.00E+00  9.64E−14  1.00E+00
GSM1100495_EXT_422.CEL  SSA/P  7.75  0.00E+00  1.00E+00  4.55E−15  1.00E+00

Sample  CLB.HP  CLB.SSA/P  CLB.decision  SCC.HP  SCC.SSA/P  SCC.decision
GSM270797.CEL  0.00E+00  1.26E−01  uncertain  5.35E−02  9.46E−01  SSA/P
GSM270798.CEL  9.57E−01  0.00E+00  HP  9.71E−01  2.93E−02  HP
GSM270799.CEL  9.63E−01  0.00E+00  HP  9.93E−01  6.80E−03  HP
GSM270800.CEL  9.71E−01  0.00E+00  HP  9.95E−01  5.45E−03  HP
GSM270801.CEL  9.67E−01  0.00E+00  HP  9.91E−01  8.82E−03  HP
GSM270802.CEL  3.63E−01  0.00E+00  uncertain  2.42E−01  7.58E−01  SSA/P
GSM270803.CEL  9.67E−01  0.00E+00  HP  9.88E−01  1.24E−02  HP
GSM270804.CEL  9.45E−01  0.00E+00  HP  9.75E−01  2.52E−02  HP
GSM270805.CEL  9.14E−01  0.00E+00  HP  8.30E−01  1.70E−01  HP
GSM270806.CEL  8.46E−01  0.00E+00  HP  7.97E−01  2.03E−01  HP
GSM270807.CEL  0.00E+00  9.50E−01  SSA/P  8.33E−04  9.99E−01  SSA/P
GSM1100490_EXT_417.CEL  0.00E+00  9.84E−01  SSA/P  1.56E−05  1.00E+00  SSA/P
GSM1100491_EXT_418.CEL  0.00E+00  9.89E−01  SSA/P  2.91E−06  1.00E+00  SSA/P
GSM1100492_EXT_419.CEL  0.00E+00  9.76E−01  SSA/P  1.21E−04  1.00E+00  SSA/P
GSM1100493_EXT_420.CEL  0.00E+00  9.67E−01  SSA/P  2.68E−04  1.00E+00  SSA/P
GSM1100494_EXT_421.CEL  0.00E+00  9.82E−01  SSA/P  2.97E−05  1.00E+00  SSA/P
GSM1100495_EXT_422.CEL  0.00E+00  9.84E−01  SSA/P  1.94E−05  1.00E+00  SSA/P

TABLE 11 Class probabilities assigned using empirical approach, normal approximation, shrunken centroid classifier (independent of the summary metric), and the Cantelli's inequality lower bound when the 13-gene signature from Table 2 is used.

Illumina samples
Sample  True.class  SM.standardized  Empirical.HP  Empirical.SSA/P  Normal.HP  Normal.SSA/P
GSM1072010  HP  −3.01  9.82E−01  1.84E−02  9.99E−01  1.30E−03
GSM1072011  HP  −4.54  9.96E−01  3.74E−03  1.00E+00  2.84E−06
GSM1072012  HP  −4.35  9.95E−01  4.77E−03  1.00E+00  6.72E−06
GSM1072013  HP  −5.28  9.99E−01  1.17E−03  1.00E+00  6.34E−08
GSM1072014  HP  −3.60  9.89E−01  1.13E−02  1.00E+00  1.59E−04
GSM1072015  HP  −5.02  9.98E−01  1.87E−03  1.00E+00  2.52E−07
GSM1072016  SSA/P  3.79  5.50E−04  9.99E−01  7.69E−05  1.00E+00
GSM1072017  SSA/P  7.28  0.00E+00  1.00E+00  1.70E−13  1.00E+00
GSM1072018  SSA/P  5.73  1.67E−05  1.00E+00  5.04E−09  1.00E+00
GSM1072019  SSA/P  6.92  0.00E+00  1.00E+00  2.21E−12  1.00E+00
GSM1072020  SSA/P  6.94  0.00E+00  1.00E+00  1.93E−12  1.00E+00
GSM1072021  SSA/P  2.12  1.11E−02  9.89E−01  1.70E−02  9.83E−01

Affymetrix samples
Sample  True.class  SM.standardized  Empirical.HP  Empirical.SSA/P  Normal.HP  Normal.SSA/P
GSM270797.CEL  HP  2.12  1.11E−02  9.89E−01  1.71E−02  9.83E−01
GSM270798.CEL  HP  −4.31  9.95E−01  4.97E−03  1.00E+00  8.07E−06
GSM270799.CEL  HP  −6.04  1.00E+00  4.50E−04  1.00E+00  7.69E−10
GSM270800.CEL  HP  −5.42  9.99E−01  1.01E−03  1.00E+00  2.96E−08
GSM270801.CEL  HP  −5.00  9.98E−01  1.92E−03  1.00E+00  2.82E−07
GSM270802.CEL  HP  −0.50  7.74E−01  2.26E−01  6.91E−01  3.09E−01
GSM270803.CEL  HP  −5.39  9.99E−01  1.06E−03  1.00E+00  3.61E−08
GSM270804.CEL  HP  −5.12  9.98E−01  1.53E−03  1.00E+00  1.53E−07
GSM270805.CEL  HP  −2.91  9.80E−01  1.98E−02  9.98E−01  1.82E−03
GSM270806.CEL  HP  −2.95  9.81E−01  1.91E−02  9.98E−01  1.57E−03
GSM270807.CEL  HP  4.33  2.00E−04  1.00E+00  7.36E−06  1.00E+00
GSM1100490_EXT_417.CEL  SSA/P  6.69  0.00E+00  1.00E+00  1.15E−11  1.00E+00
GSM1100491_EXT_418.CEL  SSA/P  9.11  0.00E+00  1.00E+00  4.02E−20  1.00E+00
GSM1100492_EXT_419.CEL  SSA/P  4.72  9.17E−05  1.00E+00  1.20E−06  1.00E+00
GSM1100493_EXT_420.CEL  SSA/P  5.20  2.50E−05  1.00E+00  1.01E−07  1.00E+00
GSM1100494_EXT_421.CEL  SSA/P  4.83  7.50E−05  1.00E+00  6.99E−07  1.00E+00
GSM1100495_EXT_422.CEL  SSA/P  6.04  8.33E−06  1.00E+00  7.64E−10  1.00E+00

Illumina samples
Sample  CLB.HP  CLB.SSA/P  CLB.decision  SCC.HP  SCC.SSA/P  SCC.decision
GSM1072010  9.01E−01  0.00E+00  HP  5.54E−01  4.46E−01  HP
GSM1072011  9.54E−01  0.00E+00  HP  8.74E−01  1.26E−01  HP
GSM1072012  9.50E−01  0.00E+00  HP  8.15E−01  1.85E−01  HP
GSM1072013  9.65E−01  0.00E+00  HP  8.77E−01  1.23E−01  HP
GSM1072014  9.28E−01  0.00E+00  HP  8.03E−01  1.97E−01  HP
GSM1072015  9.62E−01  0.00E+00  HP  8.79E−01  1.21E−01  HP
GSM1072016  0.00E+00  9.35E−01  SSA/P  2.39E−02  9.76E−01  SSA/P
GSM1072017  0.00E+00  9.81E−01  SSA/P  2.28E−03  9.98E−01  SSA/P
GSM1072018  0.00E+00  9.70E−01  SSA/P  5.74E−03  9.94E−01  SSA/P
GSM1072019  0.00E+00  9.80E−01  SSA/P  3.38E−03  9.97E−01  SSA/P
GSM1072020  0.00E+00  9.80E−01  SSA/P  3.16E−03  9.97E−01  SSA/P
GSM1072021  0.00E+00  8.18E−01  SSA/P  6.93E−02  9.31E−01  SSA/P

Affymetrix samples
Sample  CLB.HP  CLB.SSA/P  CLB.decision  SCC.HP  SCC.SSA/P  SCC.decision
GSM270797.CEL  0.00E+00  8.18E−01  SSA/P  5.26E−02  9.47E−01  SSA/P
GSM270798.CEL  9.49E−01  0.00E+00  HP  8.41E−01  1.59E−01  HP
GSM270799.CEL  9.73E−01  0.00E+00  HP  9.63E−01  3.74E−02  HP
GSM270800.CEL  9.67E−01  0.00E+00  HP  9.19E−01  8.13E−02  HP
GSM270801.CEL  9.62E−01  0.00E+00  HP  8.90E−01  1.10E−01  HP
GSM270802.CEL  2.00E−01  0.00E+00  uncertain  2.79E−01  7.21E−01  SSA/P
GSM270803.CEL  9.67E−01  0.00E+00  HP  8.85E−01  1.15E−01  HP
GSM270804.CEL  9.63E−01  0.00E+00  HP  9.32E−01  6.84E−02  HP
GSM270805.CEL  8.94E−01  0.00E+00  HP  6.52E−01  3.48E−01  HP
GSM270806.CEL  8.97E−01  0.00E+00  HP  7.36E−01  2.64E−01  HP
GSM270807.CEL  0.00E+00  9.49E−01  SSA/P  7.19E−03  9.93E−01  SSA/P
GSM1100490_EXT_417.CEL  0.00E+00  9.78E−01  SSA/P  1.42E−03  9.99E−01  SSA/P
GSM1100491_EXT_418.CEL  0.00E+00  9.88E−01  SSA/P  4.49E−04  1.00E+00  SSA/P
GSM1100492_EXT_419.CEL  0.00E+00  9.57E−01  SSA/P  1.63E−02  9.84E−01  SSA/P
GSM1100493_EXT_420.CEL  0.00E+00  9.64E−01  SSA/P  9.54E−03  9.90E−01  SSA/P
GSM1100494_EXT_421.CEL  0.00E+00  9.59E−01  SSA/P  5.76E−03  9.94E−01  SSA/P
GSM1100495_EXT_422.CEL  0.00E+00  9.73E−01  SSA/P  1.76E−03  9.98E−01  SSA/P

TABLE 12 Normalized expression levels (median and MAD) obtained by qPCR from 45 independent FFPE samples and the classification result obtained using the summary metric (SM) of the 13-gene molecular signature with different sample normalizations.

sample name  class  SPIRE1  KIZ  MEGF6  SLC7A9  PLA2G16  NTRK2  CHFR
HP1  HP  −0.79  0.00  1.28  0.51  −0.02  1.61  1.73
HP2  HP  −0.48  −0.75  −0.09  −0.02  0.06  −1.77  −0.02
HP3  HP  −0.79  −1.08  0.39  0.48  2.48  0.90  1.66
HP4  HP  0.08  0.56  −1.28  −0.63  −0.12  0.74  0.38
HP5  HP  0.01  0.64  −0.85  0.49  −0.54  0.42  0.68
HP6  HP  0.36  1.37  −1.05  −0.40  −0.45  2.06  0.57
HP7  HP  −0.79  0.87  −1.87  −1.54  0.10  2.09  1.18
HP8  HP  1.80  −2.13  0.07  −0.30  −0.65  0.42  1.15
HP9  HP  −0.44  −1.16  −3.80  −1.13  −0.73  0.46  −0.28
HP10  HP  −0.17  0.31  −2.32  −1.41  −0.12  0.63  0.49
HP11  HP  1.25  0.93  −0.22  0.55  −0.12  2.28  2.19
HP12  HP  0.86  0.56  −0.98  −0.17  −0.12  2.42  2.02
HP13  HP  0.89  0.96  −0.83  −0.27  −0.20  1.21  1.35
HP14  HP  0.97  1.66  0.00  2.44  −0.12  2.99  2.39
HP15  HP  0.70  0.87  −0.43  0.60  −0.38  1.42  1.57
HP16  HP  0.23  −0.13  −2.80  −1.43  −1.23  2.35  0.68
HP17  HP  0.90  1.30  0.35  0.00  −0.12  1.24  1.38
HP18  HP  −0.12  0.60  −2.30  −0.70  −0.12  1.71  −0.62
HP19  HP  0.17  −0.31  −2.58  −1.03  0.24  −1.89  −0.28
HP20  HP  0.64  −0.40  0.40  0.11  0.00  0.16  2.04
HP21  HP  1.00  1.29  0.58  2.12  0.24  0.00  1.80
HP22  HP  −0.70  −1.79  −3.23  −4.21  0.06  0.00  −0.28
HP23  HP  1.41  1.49  −0.41  0.86  −0.37  2.00  1.87
HP24  HP  0.92  1.33  −0.06  0.59  −0.12  0.06  1.89
SSA/P1  SSP  −0.79  −1.10  0.70  1.74  0.77  −2.48  −0.97
SSA/P2  SSP  −0.68  −0.26  −0.36  1.05  0.03  −2.28  0.00
SSA/P3  SSP  −1.14  −0.49  0.07  −0.09  0.50  −2.92  −0.23
SSA/P4  SSP  −0.79  −0.62  0.72  −0.11  0.14  −0.72  0.29
SSA/P5  SSP  −0.42  −0.75  0.19  −1.49  0.58  −1.49  −0.01
SSA/P6  SSP  −1.51  −1.17  −1.40  1.57  −0.12  −1.58  −1.69
SSA/P7  SSP  −1.12  0.51  1.62  −0.37  1.43  −4.25  −2.40
SSA/P8  SSP  −1.17  −0.64  1.56  0.75  1.39  −3.74  −1.27
SSA/P9  SSP  −0.14  0.14  0.27  0.01  −0.77  −0.60  −0.12
SSA/P10  SSP  −0.77  −0.14  0.41  0.00  −1.33  −1.26  −0.67
SSA/P11  SSP  0.00  −0.09  1.73  −0.55  0.70  −4.78  −1.93
SSA/P12  SSP  −0.58  0.99  0.24  2.42  1.93  −3.01  −0.92
SSA/P13  SSP  0.42  0.41  0.60  −0.10  −0.26  0.82  0.15
SSA/P14  SSP  0.03  −0.38  −0.01  1.83  0.63  −1.48  −1.21
SSA/P15  SSP  0.07  0.55  5.39  2.40  1.51  −3.51  −1.00
SSA/P16  SSP  −0.79  −1.18  2.29  1.32  0.22  −3.46  −2.31
SSA/P17  SSP  1.24  3.15  4.63  1.31  2.04  −1.40  −0.94
SSA/P18  SSP  −0.34  1.73  1.99  −0.96  1.03  −1.26  0.16
SSA/P19  SSP  0.74  −1.42  1.13  −0.54  −0.12  −2.60  −0.28
SSA/P20  SSP  −0.21  −0.58  −1.93  −0.81  0.15  −2.30  −0.65
SSA/P21  SSP  0.13  −0.40  −0.80  1.14  0.06  −2.26  −0.74

sample name  CHGA  PTAFR  CLDN1  TACSTD2  SEMG1  SBSPON  SM with median and MAD normalization  SM with mean and MAD normalization  SM with geometric mean and MAD normalization
HP1  2.05  −0.12  −0.75  −0.67  −0.43  1.60  −0.37  −0.72  −0.70
HP2  0.00  0.22  −2.20  0.25  −1.04  1.08  −0.09  0.00  0.04
HP3  0.84  −0.17  −0.16  −2.14  −3.07  1.89  −0.43  −0.58  −0.58
HP4  1.22  −0.02  0.01  −2.00  −0.91  −1.04  −0.59  −0.57  −0.61
HP5  1.17  0.83  −0.32  −0.90  1.19  −0.19  −0.15  −0.36  −0.35
HP6  2.90  0.71  −0.32  −0.14  −0.96  −1.11  −0.58  −0.83  −0.93
HP7  2.37  0.07  −2.90  −1.55  0.93  −0.85  −1.01  −1.04  −1.11
HP8  2.28  0.20  0.23  −1.28  2.07  1.95  −0.15  −0.49  −0.49
HP9  1.00  0.04  0.01  −0.15  −2.35  −4.50  −1.19  −0.75  −0.84
HP10  0.68  0.19  −0.59  −1.60  −1.58  0.05  −0.70  −0.57  −0.59
HP11  3.32  1.24  −0.60  −0.88  −1.34  0.33  −0.51  −0.99  −1.05
HP12  1.55  −0.15  0.41  −0.78  −0.30  −3.26  −0.76  −0.95  −1.01
HP13  2.32  0.00  −1.61  0.86  −3.56  −1.03  −0.75  −0.85  −0.92
HP14  2.55  1.18  −1.43  −0.77  −1.46  −0.99  −0.50  −0.99  −1.05
HP15  1.65  1.15  −0.32  −1.18  1.48  1.72  −0.03  −0.50  −0.48
HP16  0.77  0.00  0.72  −1.82  −0.36  −2.12  −0.98  −0.87  −0.92
HP17  2.46  0.46  −0.65  −2.08  −0.61  0.54  −0.38  −0.70  −0.74
HP18  0.68  0.24  0.47  −2.25  0.56  −0.06  −0.42  −0.45  −0.46
HP19  0.93  1.45  1.71  −0.41  1.71  −2.10  0.01  0.00  −0.02
HP20  1.91  0.00  −3.56  −0.36  −1.23  0.39  −0.62  −0.73  −0.74
HP21  2.28  0.86  −0.32  −0.71  −1.31  2.68  0.18  −0.36  −0.34
HP22  −0.84  −0.11  0.40  0.13  −1.57  −3.05  −1.00  −0.47  −0.50
HP23  1.85  0.66  −0.32  −1.24  0.72  0.95  −0.15  −0.65  −0.66
HP24  3.77  1.09  −2.15  −0.35  −0.65  1.67  −0.27  −0.70  −0.75
SSA/P1  −4.26  0.52  0.24  0.47  1.61  1.91  1.06  1.02  1.10
SSA/P2  −2.20  −0.07  −0.32  1.48  −0.28  0.00  0.39  0.45  0.49
SSA/P3  −1.77  0.00  0.00  2.28  1.50  0.81  0.64  0.60  0.64
SSA/P4  −2.20  −0.68  −0.07  0.25  −0.92  −2.56  −0.15  0.04  0.07
SSA/P5  −0.80  0.00  −0.13  0.00  0.00  −1.75  −0.11  0.03  0.05
SSA/P6  −3.21  0.18  0.05  0.38  2.77  −1.88  0.41  0.62  0.67
SSA/P7  −3.25  −0.42  1.56  2.43  3.42  0.69  1.51  1.41  1.43
SSA/P8  −2.19  0.00  −0.15  1.33  0.77  −0.53  0.81  0.87  0.91
SSA/P9  −1.05  −0.35  0.86  0.79  2.09  −0.56  0.32  0.19  0.23
SSA/P10  −1.11  0.00  0.16  2.06  2.11  −1.38  0.32  0.30  0.33
SSA/P11  −4.12  −0.02  0.34  2.32  2.09  0.30  1.36  1.42  1.44
SSA/P12  −4.28  −0.14  −0.32  1.19  2.84  5.32  1.70  1.36  1.41
SSA/P13  −0.65  −0.71  1.16  −0.10  1.43  −0.19  0.18  −0.05  0.00
SSA/P14  −2.76  0.00  1.06  2.01  1.12  0.44  0.94  0.78  0.84
SSA/P15  −4.76  0.00  1.29  3.55  1.35  3.03  2.19  1.67  1.68
SSA/P16  −3.85  0.15  0.56  0.64  −0.09  −2.61  0.78  1.05  1.08
SSA/P17  −5.16  0.64  1.60  0.25  1.27  0.92  1.89  1.39  1.38
SSA/P18  −2.23  −0.14  −0.69  0.25  0.70  0.50  0.57  0.44  0.46
SSA/P19  −5.99  −0.11  0.44  2.95  −2.17  0.28  0.77  0.99  1.01
SSA/P20  −2.01  0.00  0.72  0.94  −1.08  −1.73  0.03  0.32  0.34
SSA/P21  −3.25  0.37  1.44  2.57  0.89  −1.07  0.81  0.79  0.83
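The first SM column of Table 12 can be recomputed from the thirteen normalized values in each row. The sketch below is an observation drawn from the tabulated numbers, not the source's stated definition: it assumes the SM is the mean of the normalized values after flipping the sign of the three genes that the claims describe as decreased in SSA/Ps (CHFR, CHGA, NTRK2). Under that assumption the rows shown reproduce the tabulated SM to two decimals. The function and variable names are hypothetical.

```python
# Genes described in the claims as increased in SSA/P relative to the reference.
UP_GENES = ("SPIRE1", "KIZ", "MEGF6", "SLC7A9", "PLA2G16",
            "PTAFR", "CLDN1", "TACSTD2", "SEMG1", "SBSPON")
# Genes described in the claims as decreased in SSA/P relative to the reference.
DOWN_GENES = ("NTRK2", "CHFR", "CHGA")


def summary_metric(normalized: dict) -> float:
    """Signed mean over the 13 genes: up genes count positively, down genes negatively.

    Assumption: this form is inferred from the values in Table 12; it is not
    quoted from the source text.
    """
    signed_sum = (sum(normalized[g] for g in UP_GENES)
                  - sum(normalized[g] for g in DOWN_GENES))
    return signed_sum / (len(UP_GENES) + len(DOWN_GENES))


# Normalized (median and MAD) values for two rows of Table 12.
HP1 = {"SPIRE1": -0.79, "KIZ": 0.00, "MEGF6": 1.28, "SLC7A9": 0.51,
       "PLA2G16": -0.02, "NTRK2": 1.61, "CHFR": 1.73, "CHGA": 2.05,
       "PTAFR": -0.12, "CLDN1": -0.75, "TACSTD2": -0.67, "SEMG1": -0.43,
       "SBSPON": 1.60}
SSAP1 = {"SPIRE1": -0.79, "KIZ": -1.10, "MEGF6": 0.70, "SLC7A9": 1.74,
         "PLA2G16": 0.77, "NTRK2": -2.48, "CHFR": -0.97, "CHGA": -4.26,
         "PTAFR": 0.52, "CLDN1": 0.24, "TACSTD2": 0.47, "SEMG1": 1.61,
         "SBSPON": 1.91}

if __name__ == "__main__":
    # Tabulated SM (median and MAD normalization): HP1 = -0.37, SSA/P1 = 1.06.
    print(round(summary_metric(HP1), 2))
    print(round(summary_metric(SSAP1), 2))
```

A negative SM therefore points toward HP and a positive SM toward SSA/P, which matches the sign pattern of the SM columns across the 45 samples.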

TABLE 13 Raw expression levels of 13 genes in the molecular signature obtained by qPCR from 45 independent FFPE samples.

sample name  class  SPIRE1  KIZ  MEGF6  SLC7A9  PLA2G16  NTRK2
HP1  HP  16.77  14.64  16.1  19.44  16.67  15.16
HP2  HP  15.94  14.87  16.94  19.44  16.06  18.01
HP3  HP  16.4  15.35  16.61  19.09  13.8  15.5
HP4  HP  11.41  9.59  14.16  16.08  12.28  11.54
HP5  HP  13.33  11.37  15.59  16.81  14.55  13.71
HP6  HP  11.11  8.76  13.92  15.84  12.59  10.2
HP7  HP  13.91  10.91  16.39  18.62  13.68  11.81
HP8  HP  11.69  14.27  14.81  17.75  14.8  13.85
HP9  HP  11.68  11.07  16.44  16.34  12.64  11.57
HP10  HP  13.1  11.29  16.65  18.31  13.73  13.1
HP11  HP  12.75  11.74  15.62  17.42  14.79  12.51
HP12  HP  13.04  12.01  16.27  18.04  14.69  12.27
HP13  HP  12.97  11.56  16.08  18.09  14.72  13.43
HP14  HP  13.05  11.01  15.41  15.55  14.81  11.82
HP15  HP  14.59  13.09  17.12  18.66  16.34  14.66
HP16  HP  12.17  11.19  16.59  17.79  14.29  10.84
HP17  HP  12.72  10.98  14.66  17.58  14.41  13.17
HP18  HP  13  10.93  16.57  17.54  13.67  11.96
HP19  HP  13.75  12.89  17.9  18.91  14.35  16.6
HP20  HP  14.4  14.1  16.04  18.9  15.71  15.67
HP21  HP  15.55  13.92  17.37  18.4  16.98  17.34
HP22  HP  14.2  13.96  18.14  21.68  14.12  14.3
HP23  HP  12.66  11.24  15.88  17.18  15.11  12.86
HP24  HP  14.44  12.69  16.82  18.73  16.15  16.09
SSA/P1  SSA/P  12.28  11.25  12.19  13.72  11.39  14.76
SSA/P2  SSA/P  11.57  9.81  12.65  13.81  11.54  13.96
SSA/P3  SSA/P  12.13  10.14  12.32  15.05  11.16  14.7
SSA/P4  SSA/P  12.83  11.32  12.71  16.12  12.57  13.55
SSA/P5  SSA/P  12.65  11.64  13.43  17.69  12.32  14.51
SSA/P6  SSA/P  12.69  11  13.97  13.57  11.96  13.55
SSA/P7  SSA/P  15.63  12.65  14.28  18.83  13.74  19.55
SSA/P8  SSA/P  15.16  13.29  13.83  17.2  13.27  18.52
SSA/P9  SSA/P  14.75  13.13  15.74  18.57  16.05  16
SSA/P10  SSA/P  15.7  13.74  15.91  18.9  16.93  16.98
SSA/P11  SSA/P  14.09  12.84  13.75  18.61  14.07  19.67
SSA/P12  SSA/P  13.25  10.34  13.83  14.21  11.41  16.47
SSA/P13  SSA/P  15.13  13.8  16.34  19.62  16.48  15.52
SSA/P14  SSA/P  14.22  13.29  15.66  16.38  14.29  16.53
SSA/P15  SSA/P  15.06  13.24  11.14  16.7  14.29  19.43
SSA/P16  SSA/P  14.22  13.28  12.54  16.08  13.88  17.69
SSA/P17  SSA/P  13.19  9.94  11.2  17.09  13.06  16.62
SSA/P18  SSA/P  14.12  10.72  13.19  18.71  13.43  15.84
SSA/P19  SSA/P  13.7  14.52  14.7  18.95  15.23  17.83
SSA/P20  SSA/P  14.13  13.17  17.25  18.7  14.45  17.01
SSA/P21  SSA/P  13.63  12.82  15.95  16.58  14.36  16.81

sample name  CHFR  CHGA  PTAFR  CLDN1  TACSTD2  SEMG1  SBSPON
HP1  14.76  11.47  16.89  17.2  17.69  19.29  18.44
HP2  15.98  13  16.03  18.13  16.25  19.37  18.43
HP3  14.46  12.31  16.56  16.24  18.79  21.55  17.77
HP4  11.61  7.81  12.3  11.95  14.53  15.27  16.58
HP5  13.16  9.71  13.3  14.13  15.28  15.03  17.59
HP6  11.4  6.11  11.55  12.26  12.65  15.31  16.64
HP7  12.44  8.28  13.83  16.49  15.71  15.06  18.02
HP8  12.84  8.74  14.07  13.72  15.81  14.29  15.59
HP9  12.03  7.78  11.99  11.71  12.44  16.47  19.8
HP10  12.96  9.8  13.54  14  15.58  17.4  16.95
HP11  12.32  8.22  13.55  15.08  15.93  18.22  17.73
HP12  12.39  9.9  14.84  13.97  15.73  17.08  21.21
HP13  13.01  9.07  14.65  15.94  14.04  20.29  18.94
HP14  12.14  9.01  13.62  15.92  15.83  18.36  19.06
HP15  14.23  11.18  14.93  16.08  17.52  16.7  17.62
HP16  12.22  9.17  13.19  12.15  15.26  15.63  18.57
HP17  12.74  8.7  13.95  14.74  16.74  17.1  17.13
HP18  14  9.73  13.43  12.88  16.17  15.2  16.99
HP19  14.71  10.52  13.26  12.69  15.38  15.09  20.07
HP20  13.51  10.68  15.83  19.08  16.44  19.15  18.71
HP21  15.25  11.82  16.48  17.34  18.31  20.74  17.93
HP22  14.29  11.89  14.41  13.58  14.42  17.96  20.61
HP23  12.7  9.76  14.2  14.86  16.36  16.23  17.18
HP24  13.97  9.13  15.06  17.99  16.76  18.89  17.75
SSA/P1  12.97  13.29  11.76  11.73  12.07  12.76  13.64
SSA/P2  11.4  10.63  11.75  11.68  10.46  14.05  14.95
SSA/P3  11.72  10.31  11.78  11.46  9.75  12.37  14.24
SSA/P4  12.26  11.79  13.51  12.59  12.84  15.84  18.66
SSA/P5  12.74  10.57  13.02  12.84  13.27  15.11  18.03
SSA/P6  13.37  11.92  11.78  11.59  11.84  11.29  17.11
SSA/P7  17.41  15.29  15.71  13.41  13.11  13.96  17.86
SSA/P8  15.77  13.72  14.78  14.61  13.71  16.1  18.58
SSA/P9  15.24  13.21  15.75  14.23  14.87  15.4  19.23
SSA/P10  16.11  13.58  15.72  15.24  13.91  15.7  20.37
SSA/P11  16.53  15.76  14.91  14.22  12.81  14.88  17.85
SSA/P12  14.1  14.49  13.6  13.46  12.52  12.71  11.41
SSA/P13  15.91  13.74  17.05  14.86  16.7  17  19.79
SSA/P14  15.96  14.55  15.04  13.67  13.28  16.01  17.87
SSA/P15  16.64  17.43  15.92  14.31  12.62  16.66  16.15
SSA/P16  16.25  14.83  14.07  13.35  13.84  16.4  20.1
SSA/P17  15.87  17.14  14.58  13.31  15.22  16.04  17.57
SSA/P18  14.14  13.56  14.71  14.95  14.58  15.97  17.34
SSA/P19  15.23  17.97  15.34  14.47  12.54  19.49  18.21
SSA/P20  15.08  13.48  14.72  13.68  14.03  17.88  19.71
SSA/P21  15  14.55  14.18  12.79  12.23  15.74  18.88

What is claimed is:
 1. A method of detecting a sessile serrated adenoma/polyp (SSA/P) in a subject, the method comprising:
 a. determining the level of expression of nucleic acids in a molecular signature in a biological sample obtained from the subject, wherein the molecular signature consists of CHFR, CHGA, CLDN1, KIZ, MEGF6, NTRK2, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, and TACSTD2, and optionally includes one or more of FOXD1, PIK3R3, PRUNE2, TPD52L1, TRIB2, C4BPA, CPE, DPP10, GRAMD1B, GRIN2D, KLK7, MYNC, TM4SF4 and one or more nucleic acids used as a normalization control;
 b. comparing the level of expression of each nucleic acid in the molecular signature to a reference value;
 c. detecting a SSA/P in the subject based on the level of expression of each nucleic acid in the molecular signature relative to the reference value, wherein SSA/P is detected when CHFR, CHGA, and NTRK2 are decreased relative to the reference value, and when CLDN1, KIZ, MEGF6, PLA2G16, PTAFR, SBSPON, SEMG1, SLC7A9, SPIRE1, and TACSTD2 are increased relative to the reference value, and the reference value is the level of expression of each nucleic acid in the molecular signature in a non-diseased sample or hyperplastic polyp sample; and
 d. removing the SSA/P by a method selected from polypectomy, endoscopic resection and surgical resection.
 2. The method of claim 1, wherein the one or more nucleic acids used as a normalization control are selected from the group consisting of GAPDH, ACTB, B2M, TUBA, G6PD, LDHA, HPRT, ALDOA, PFKP, PGK1, PGAM1, VIM and UBC.
 3. The method of claim 1, wherein the method to determine the level of expression of the nucleic acids in the molecular signature is microarray, RNA-seq or real-time qPCR.
 4. The method of claim 1, wherein the biological sample is a tissue biopsy.